Preface to the Second Edition


This book presents a general introduction to enumerative, bijective, and algebraic combina- 
torics. Enumerative combinatorics is the mathematical theory of counting. This branch of 
discrete mathematics has flourished in the last few decades due to its many applications to 
probability, computer science, engineering, physics, and other areas. Bijective combinatorics 
produces elegant solutions to counting problems by setting up one-to-one correspondences 
(bijections) between two sets of combinatorial objects. Algebraic combinatorics uses com- 
binatorial methods to obtain information about algebraic structures such as permutations, 
polynomials, matrices, and groups. This relatively new subfield of combinatorics has had 
a profound influence on classical mathematical subjects such as representation theory and 
algebraic geometry. 

Part I of the text covers fundamental counting tools including the Sum and Product 
Rules, binomial coefficients, recursions, bijective proofs of combinatorial identities, enumer- 
ation problems in graph theory, inclusion-exclusion formulas, generating functions, ranking 
algorithms, and successor algorithms. This part requires minimal mathematical prerequisites 
and could be used for a one-semester combinatorics course at the advanced undergradu- 
ate or beginning graduate level. This material will be interesting and useful for computer 
scientists, statisticians, engineers, and physicists, as well as mathematicians. 

Part II of the text contains an introduction to algebraic combinatorics, discussing groups, 
group actions, permutation statistics, tableaux, symmetric polynomials, and formal power 
series. My presentation of symmetric polynomials is more combinatorial (and, I hope, more 
accessible) than the standard reference work [84]. In particular, a novel approach based 
on antisymmetric polynomials and abaci yields elementary combinatorial proofs of some 
advanced results such as the Pieri Rules and the Littlewood-Richardson Rule for multiplying
Schur symmetric polynomials. Part II assumes a bit more mathematical sophistication
on the reader’s part (mainly some knowledge of linear algebra) and could be used for a 
one-semester course for graduate students in mathematics and related areas. Some relevant 
background material from abstract algebra and linear algebra is reviewed in an appendix. 
The final chapter consists of independent sections on optional topics that complement ma- 
terial in the main text. In many chapters, some of the harder material in later sections can 
be omitted without loss of continuity. 

Compared to the first edition, this new edition has an earlier, expanded treatment of 
generating functions that focuses more on the combinatorics and applications of generating 
functions and less on the algebraic formalism of formal power series. In particular, we provide 
greater coverage of exponential generating functions and the use of generating functions to 
solve recursions, evaluate summations, and enumerate complex combinatorial structures. 
We cover successor algorithms in more detail in Chapter 6, providing automatic methods to 
create these algorithms directly from counting arguments based on the Sum and Product 
Rules. The final chapter contains some new material on quasisymmetric polynomials. Many 
chapters in Part I have been reorganized to start with elementary content most pertinent 
to solving applied problems, deferring formal proofs and advanced material until later. I 
hope this restructuring makes the second edition more readable and appealing than the 
first edition, without sacrificing mathematical rigor. 


Each chapter ends with a summary, a set of exercises, and bibliographic notes. The 
book contains over 1200 exercises, ranging in difficulty from routine verifications to unsolved 
problems. Although we provide references to the literature for some of the major theorems 
and harder problems, no attempt has been made to pinpoint the original source for every 
result appearing in the text and exercises. 

I am grateful to the editors, reviewers, and other staff at CRC Press for their help 
with the preparation of this second edition. Readers may communicate errors and other 
comments to the author by sending e-mail to nloehr@vt.edu.


Nicholas A. Loehr 


Introduction 


The goal of enumerative combinatorics is to count the number of objects in a given finite 
set. This may seem like a simple task, but the sets we want to count are often very large 
and complicated. Here are some examples of enumeration problems that can be solved using 
the techniques in this book. How many encryption keys are available using the 128-bit AES 
encryption algorithm? How many strands of DNA can be built using five copies of each of 
the nucleotides adenine, cytosine, guanine, and thymine? How many ways can we be dealt 
a full house in five-card poker? How many ways can we place five rooks on a chessboard 
with no two rooks in the same row or column? How many subsets of {1,2,...,n} contain no
two consecutive integers? How many ways can we write 100 as a sum of positive integers? 
How many connected graphs on n vertices have no cycles? How many integers between 
1 and n are relatively prime to n? How many circular necklaces can be made with three 
rubies, two emeralds, and one diamond if all rotations of a given necklace are considered 
the same? How many ways can we tile a chessboard with dominos? The answers to such 
counting questions can help us solve a wide variety of problems in probability, cryptography, 
algorithm analysis, physics, abstract algebra, and other areas of mathematics. 

Part I of this book develops the basic principles of counting, placing particular emphasis 
on the role of bijections. To give a bijective proof that a given set S has size n, one must
construct an explicit one-to-one correspondence (bijection) from S onto the set {1,2,...,n}.
More generally, one can prove that two sets A and B have the same size by exhibiting a 
bijection between A and B. For example, given a fixed positive integer n, let A be the set 
of all strings $w_1 w_2 \cdots w_{2n}$ consisting of n left parentheses and n right parentheses that are
balanced (every left parenthesis can be matched to a right parenthesis later in the sequence). 
Let B be the set of all arrays 

\[
\begin{bmatrix} y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix}
\]

such that every number in {1,2,...,2n} appears once in the array, $y_1 < y_2 < \cdots < y_n$,
$z_1 < z_2 < \cdots < z_n$, and $y_i < z_i$ for every i. The sets A and B seem quite different at first glance.
Yet, we can demonstrate that A and B have the same size using the following bijection. 
Given $w = w_1 w_2 \cdots w_{2n}$ in A, let $y_1, y_2, \ldots, y_n$ be the positions of the left parentheses in w
(written in increasing order), and let $z_1, z_2, \ldots, z_n$ be the positions of the right parentheses
in w (written in increasing order). For example, the string (()())((()))() in A maps to
the array

\[
\begin{bmatrix} 1 & 2 & 4 & 7 & 8 & 9 & 13 \\ 3 & 5 & 6 & 10 & 11 & 12 & 14 \end{bmatrix}.
\]


One may check that the requirement $y_i < z_i$ for all i is equivalent to the fact that w is
a balanced string of parentheses. The string w is uniquely determined by the array of $y_i$'s
and $z_i$'s, and every such array arises from some string w in A. Thus we have defined the
required one-to-one correspondence between A and B. We now know that the sets A and 
B have the same size, although we have not yet determined what that size is! 
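
The correspondence just described is easy to test by machine for small n. The following Python sketch is only an illustration: the function names (`is_balanced`, `to_array`, `strings_in_A`, `arrays_in_B`) are our own choices, not notation from the text. It lists A and B by brute force and checks that the map from strings to arrays hits every array in B exactly once.

```python
from itertools import combinations, product

def is_balanced(w):
    """Return True if w is a balanced sequence of parentheses."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "(" else -1
        if depth < 0:            # a right parenthesis with no earlier partner
            return False
    return depth == 0

def to_array(w):
    """Map a string in A to its two-row array (positions are 1-based)."""
    y = [i for i, ch in enumerate(w, start=1) if ch == "("]
    z = [i for i, ch in enumerate(w, start=1) if ch == ")"]
    return y, z

def strings_in_A(n):
    """List all balanced strings with n left and n right parentheses."""
    return ["".join(w) for w in product("()", repeat=2 * n) if is_balanced(w)]

def arrays_in_B(n):
    """Yield every array in B by choosing which positions form the top row."""
    for top in combinations(range(1, 2 * n + 1), n):
        bottom = [p for p in range(1, 2 * n + 1) if p not in top]
        if all(y < z for y, z in zip(top, bottom)):
            yield list(top), bottom

print(to_array("(()())((()))()"))
# ([1, 2, 4, 7, 8, 9, 13], [3, 5, 6, 10, 11, 12, 14])
```

Comparing `sorted(to_array(w) for w in strings_in_A(n))` with `sorted(arrays_in_B(n))` for a few small values of n confirms the bijection experimentally, though of course it does not replace the proof.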

Bijective proofs, while elegant, can be very difficult to discover. For example, let C be 
the set of rearrangements of 1,2,...,n that have no decreasing subsequence of length three.
It turns out that the sets B and C have the same size, so there must exist a bijection from 
B to C. Can you find one? (Before spending too long on this question, you might want to
read §12.13.) 

Luckily, the field of enumerative combinatorics contains a whole arsenal of techniques to 
help us solve complicated counting problems. Besides bijections, some of these techniques 
include recursions, generating functions, group actions, inclusion-exclusion formulas, linear 
algebra, probabilistic methods, and symmetric polynomials. In the rest of this introduction, 
we describe several challenging enumeration problems that can be solved using these more 
advanced methods. These problems, and the combinatorial technology needed to solve them, 
will be discussed at greater length later in the text. 


Standard Tableaux 


Suppose we are given a diagram D consisting of a number of rows of boxes, left-justified, 
with each row no longer than the one above it. For example, consider this diagram: 


Let n be the total number of boxes in the diagram. A standard tableau of shape D is a filling 
of the boxes in D with the numbers 1, 2, ..., n (used once each) so that every row forms an
increasing sequence (reading left to right), and every column forms an increasing sequence 
(reading top to bottom). For example, here are three standard tableaux of shape D, where 
D is the diagram pictured above: 


Question: Given a diagram D of n cells, how many standard tableaux of shape D are there? 

There is a remarkable answer to this counting problem, known as the Hook-Length 
Formula. To state it, we need to define hooks and hook lengths. The hook of a box b in a 
diagram D consists of all boxes to the right of b in its row, all boxes below b in its column,
and box b itself. The hook length of b, denoted h(b), is the number of boxes in the hook of
b. For example, if b is the first box in the second row of D, then the hook of b consists of
the marked boxes in the following picture: 


Hook-Length Formula: Given a diagram D of n cells, the number of standard tableaux 
of shape D is n! divided by the product of the hook lengths of all the boxes in D.
For the diagram D in our example, the formula says there are exactly


g! 
standard tableaux of shape D. Observe that the set B of 2 x n arrays (discussed above) can
also be enumerated with the aid of the Hook-Length Formula. In this case, the diagram D 
consists of two rows of length n. The hook lengths for boxes in the top row are n+1, n,
n-1, ..., 2, while the hook lengths in the bottom row are n, n-1, ..., 1. Since there are
2n boxes in D, the Hook-Length Formula asserts that

\[
|B| = \frac{(2n)!}{(n+1) \cdot n \cdot (n-1) \cdots 2 \cdot n \cdot (n-1) \cdots 1} = \frac{(2n)!}{(n+1)!\,n!}.
\]

The fraction on the right side is an integer called the nth Catalan number. Since we previ- 
ously displayed a bijection between B and A (the set of strings of balanced parentheses), 
we conclude that the size of A is also given by a Catalan number. As we will see, many 
different types of combinatorial structures are counted by the Catalan numbers. 
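
As a sanity check on the Hook-Length Formula, here is a small Python sketch. It assumes a diagram is encoded as a non-empty list of weakly decreasing row lengths, and the function names are ours. It computes hook lengths directly from the definition and compares the formula with an independent count that repeatedly removes the box holding the largest entry.

```python
from math import factorial

def hook_lengths(shape):
    """Hook lengths of the diagram with the given (non-empty) list of row lengths."""
    # col_heights[c] = number of rows long enough to contain column c.
    col_heights = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    return [[(row_len - c - 1) + (col_heights[c] - r - 1) + 1   # right + below + box itself
             for c in range(row_len)]
            for r, row_len in enumerate(shape)]

def count_by_hooks(shape):
    """Number of standard tableaux according to the Hook-Length Formula."""
    prod = 1
    for row in hook_lengths(shape):
        for h in row:
            prod *= h
    return factorial(sum(shape)) // prod

def count_by_recursion(shape):
    """Count standard tableaux by removing the box that holds the largest entry."""
    shape = tuple(r for r in shape if r > 0)
    if not shape:
        return 1
    total = 0
    for i, r in enumerate(shape):
        # The largest entry sits at the end of a row strictly longer than the next row.
        if i == len(shape) - 1 or shape[i + 1] < r:
            total += count_by_recursion(shape[:i] + (r - 1,) + shape[i + 1:])
    return total

print(hook_lengths([3, 3]))      # [[4, 3, 2], [3, 2, 1]]
print(count_by_hooks([3, 3]))    # 5, the Catalan number (2n)!/((n+1)! n!) for n = 3
```

For two-row shapes `[n, n]` both counts agree with the Catalan numbers, and they also agree on other small shapes we tried, such as `[4, 3, 1, 1]`.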

How is the Hook-Length Formula proved? Many proofs of this formula have been found 
since it was originally discovered in 1954. There are algebraic proofs, probabilistic proofs, 
combinatorial proofs, and (relatively recently) bijective proofs of this formula. Here we 
discuss a flawed probabilistic argument that gives a little intuition for how the mysterious 
Hook-Length Formula arises. Suppose we choose a random filling F of the boxes of D with
the integers 1,2,...,n. What is the probability that this filling will actually be a standard
tableau? We remark that the filling is standard if and only if for every box b in D, the
entry in b is the smallest number in the hook of b. Since any of the boxes in the hook
is equally likely to contain the smallest value, we see that the probability of this event is
1/h(b). Multiplying these probabilities together would give $1/\prod_{b \in D} h(b)$ as the probability
that the random filling we chose is a standard tableau. Since the total number of possible
fillings is n! (by the Product Rule, discussed in Chapter 1), this leads us to the formula
$n!/\prod_{b \in D} h(b)$ for the number of standard tableaux of shape D.

Unfortunately, the preceding argument contains a fatal error. The events “the entry 
in box b is the smallest in the hook of b,” for various choices of b, are not necessarily
independent (see §1.9). Thus we cannot find the probability that all these events occur by 
multiplying together the probabilities of each individual event. Nevertheless, remarkably, 
the final answer obtained by making this erroneous independence assumption turns out 
to be correct! The Hook-Length Formula can be justified by a more subtle probabilistic 
argument due to Greene, Nijenhuis, and Wilf. We describe this argument in §12.12. 
Rook Placements


A rook is a chess piece that can travel any number of squares along its current row or column 
in a single move. We say that the rook attacks all the squares in its row and column. How 
many ways can we place eight rooks on an ordinary 8 x 8 chessboard so that no two rooks 
attack one another? The answer is 8! = 40,320. More generally, we can show that there are
n! ways to place n non-attacking rooks on an n x n chessboard. To see this, first note that
there must be exactly one rook in each of the n rows. The rook in the top row can occupy
any of the n columns. The rook in the next row can occupy any of the n - 1 columns not
attacked by the first rook; then there are n - 2 available columns for the next rook, and
so on. By the Product Rule (discussed in Chapter 1), the total number of placements is
therefore $n \times (n-1) \times (n-2) \times \cdots \times 1 = n!$.

Now consider an (n + 1) x (n +1) chessboard with a bishop occupying the upper-left 
corner square. (A bishop is a chess piece that attacks all squares that can be reached from 
its current square by moving in a straight line northeast, northwest, southeast, or southwest 
along a diagonal of the chessboard.) Question: How many ways can we place n rooks on 
this chessboard so that no two pieces attack one another? An example of such a placement 
on a standard chessboard (n + 1 = 8, so n = 7) is shown below: 
It turns out that the number of non-attacking placements is the closest integer to n!/e. Here,
e is the famous constant $e = \sum_{k=0}^{\infty} 1/k! \approx 2.718281828$ that appears throughout the subject
of calculus. When n = 7, the number of placements is 1854 (note 7!/e = 1854.112...). 

This answer follows from the Inclusion-Exclusion Formulas to be discussed in Chapter 4. 
We sketch the derivation now to indicate how the number e appears. First, there are n! 
ways to place the n rooks on the board so that no two rooks attack each other, and no rook 
occupies the top row or the leftmost column (lest a rook attack the bishop). However, we 
have counted many configurations in which one or more rooks occupy the diagonal attacked 
by the bishop. To correct for this, we will subtract a term that accounts for configurations of 
this kind. We can build such a configuration by placing a rook in row i, column i, for some
i between 2 and n+1, and then placing the remaining rooks in different rows and columns
in (n - 1)! ways. So, presumably, we should subtract n x (n - 1)! = n! from our original
count of n!. But now our answer is zero! The trouble is that our subtracted term over-counts
those configurations in which two or more rooks are attacked by the bishop. A naive count
leads to the conclusion that there are $\binom{n}{2}(n-2)! = n!/2!$ such configurations, but this
figure over-counts configurations with three or more rooks on the main diagonal. Thus we 
are led to a formula (called an Inclusion-Exclusion Formula) in which we alternately add 
and subtract various terms to correct for all the over-counting. In the present situation, the 
final answer turns out to be 


\[
n! - n! + n!/2! - n!/3! + n!/4! - n!/5! + \cdots + (-1)^n n!/n! = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}.
\]


Next, recall from calculus that $e^x = \sum_{k=0}^{\infty} x^k/k!$ for all real x. In particular, taking x = -1,
we have
\[
e^{-1} = \frac{1}{e} = 1 - 1 + 1/2! - 1/3! + 1/4! - 1/5! + \cdots = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}.
\]


We see that the combinatorial formula stated above consists of the first n + 1 terms in 
the infinite series for n!/e. It can be shown (see §4.5) that the “tail” of this series, namely 
$\sum_{k=n+1}^{\infty} (-1)^k n!/k!$, is always less than 0.5 in absolute value. Thus, rounding n!/e to the
nearest integer will produce the required answer. 
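
The three descriptions of this count (direct enumeration, the alternating sum, and rounding n!/e) can be compared numerically. The Python sketch below is only an illustration with function names of our own choosing. It models a placement as a permutation p of {0, ..., n-1}, where the rook in row i of the n x n sub-board (the board with the bishop's row and column removed) sits in column p[i], and no rook may lie on the diagonal.

```python
from itertools import permutations
from math import e, factorial

def placements_brute_force(n):
    """Count rook placements on the n x n sub-board with no rook on its diagonal."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

def placements_inclusion_exclusion(n):
    """Evaluate n! * sum_{k=0}^{n} (-1)^k / k! in exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

for n in range(1, 9):
    print(n, placements_brute_force(n),
          placements_inclusion_exclusion(n), round(factorial(n) / e))
```

For n = 7 all three columns show 1854, matching the value quoted above. The brute-force count visits all n! permutations, so it is only practical for small n.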
Another interesting combinatorial problem arises by comparing non-attacking rook
placements on two boards of different shapes. For instance, consider the two generalized 
chessboards shown here: 


One can check that for every k ≥ 1, the number of ways to place k non-attacking rooks
on the first board is the same as the number of ways to place k non-attacking rooks on the 
second board. We say that two boards are rook-equivalent whenever this property holds. It 
turns out that an n x n board is always rook-equivalent to a board with successive row 
lengths 2n - 1, 2n - 3, ..., 5, 3, 1. More generally, there is a simple criterion for deciding
whether two boards “of partition shape” are rook-equivalent. We will present this criterion 
in §12.3. 


Tilings 


Now we turn to yet another problem involving chessboards. A domino is a rectangular object 
that can cover two horizontally or vertically adjacent squares on a chessboard. A tiling of a 
board is a covering of the board with dominos such that each square is covered by exactly 
one domino. For example, here is one possible tiling of a standard 8 x 8 chessboard: 


Question: Given a board of dimensions m x n, how many ways can we tile it with dominos?
This question may seem unfathomably difficult, so let us first consider the special case where
m = 2. In this case, we are tiling a 2 x n region with dominos. Let $f_n$ be the number of
such tilings, for n = 0,1,2,.... One can see by drawing pictures that

\[ f_0 = f_1 = 1,\ f_2 = 2,\ f_3 = 3,\ f_4 = 5,\ f_5 = 8,\ f_6 = 13,\ \ldots \]


The reader may recognize these numbers as being the start of the famous Fibonacci sequence. 
This sequence is defined recursively by letting $F_0 = F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$ for all
n ≥ 2. Now, a routine counting argument can be used to prove that the tiling numbers $f_n$
satisfy the same recursive formula $f_n = f_{n-1} + f_{n-2}$. To see this, note that a 2 x n tiling
either ends with one vertical domino or two stacked horizontal dominos. Removing this part
of the tiling either leaves a 2 x (n - 1) tiling counted by $f_{n-1}$ or a 2 x (n - 2) tiling counted
by $f_{n-2}$. Since the sequences $(f_n)$ and $(F_n)$ satisfy the same recursion and initial conditions,
we must have $f_n = F_n$ for all n.
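
Readers who prefer experiment to pictures can confirm the first few tiling numbers with a short backtracking search. The Python sketch below is one possible approach, not the method of the text: it always covers the first empty square (in row-major order) with either a horizontal or a vertical domino, so each tiling is generated exactly once. The names `count_tilings` and `fib` are ours.

```python
def count_tilings(m, n):
    """Count domino tilings of an m x n board by covering the first empty square."""
    covered = [[False] * n for _ in range(m)]

    def fill():
        for r in range(m):
            for c in range(n):
                if not covered[r][c]:
                    total = 0
                    # Horizontal domino on squares (r, c) and (r, c + 1).
                    if c + 1 < n and not covered[r][c + 1]:
                        covered[r][c] = covered[r][c + 1] = True
                        total += fill()
                        covered[r][c] = covered[r][c + 1] = False
                    # Vertical domino on squares (r, c) and (r + 1, c).
                    if r + 1 < m and not covered[r + 1][c]:
                        covered[r][c] = covered[r + 1][c] = True
                        total += fill()
                        covered[r][c] = covered[r + 1][c] = False
                    return total
        return 1   # no empty square remains: one complete tiling

    return fill()

def fib(n):
    """Fibonacci numbers with the text's convention F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([count_tilings(2, n) for n in range(7)])   # [1, 1, 2, 3, 5, 8, 13]
```

The same function works for any small m x n board, although its running time grows quickly with the area.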

Now, what about the original tiling problem? Since the area of a tiled board must be 
even, there are no tilings unless at least one of the dimensions of the board is even. For 
boards satisfying this condition, Kasteleyn, Temperley, and Fisher proved the following 
amazing result. The number of domino tilings of an m x n chessboard (with m even) is 
exactly equal to

\[
2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n} \sqrt{\cos^2\frac{j\pi}{m+1} + \cos^2\frac{k\pi}{n+1}}.
\]

The formula is especially striking since the individual factors in the product are transcen- 
dental numbers, yet the product of all these factors is a positive integer! When m = n = 8, 
the formula reveals that the number of domino tilings of a standard chessboard is 12,988,816. 
The proof of the formula involves Pfaffians, which are quantities analogous to determinants 
that arise in the study of skew-symmetric matrices. For details, see §12.15 and §12.16. 
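
The product formula can be evaluated numerically. The sketch below assumes the formula in its commonly stated form, namely $2^{mn/2}$ times the product over $1 \le j \le m/2$ and $1 \le k \le n$ of $\sqrt{\cos^2(j\pi/(m+1)) + \cos^2(k\pi/(n+1))}$. It uses floating-point arithmetic and rounds at the end, so it is a numerical check rather than an exact computation; the function name is ours.

```python
from math import cos, pi, sqrt

def tilings_by_formula(m, n):
    """Evaluate the domino tiling product formula for an m x n board (m even)."""
    if m % 2 != 0:
        raise ValueError("this form of the formula assumes m is even")
    product = 1.0
    for j in range(1, m // 2 + 1):
        for k in range(1, n + 1):
            # One factor of 2 per term accounts for the overall 2^(mn/2).
            product *= 2 * sqrt(cos(j * pi / (m + 1)) ** 2 + cos(k * pi / (n + 1)) ** 2)
    return round(product)

print(tilings_by_formula(8, 8))                       # 12988816
print([tilings_by_formula(2, n) for n in range(1, 7)])  # [1, 2, 3, 5, 8, 13]
```

For m = 2 the values reproduce the Fibonacci numbers found earlier, which gives some reassurance that the formula has been transcribed correctly.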


Notes 


Different proofs of the Hook-Length Formula may be found in [37, 40, 55, 96, 103]. Treat- 
ments of various aspects of rook theory appear in [36, 48, 49, 69]. The Domino Tiling 
Formula was proved by Kasteleyn in [70] and discovered independently by Fisher and Tem- 
perley [31]. 
Basic Counting


This chapter develops the basic counting techniques that form the foundation of enumerative 
combinatorics. We apply these techniques to study fundamental combinatorial objects such 
as words, permutations, subsets, functions, and lattice paths. We also give some applications 
of combinatorics to probability theory. 
1.1 The Product Rule


We begin with the Product Rule, which gives us a way to count objects that can be built 
up from smaller ingredients by making a sequence of choices. 


1.1. The Product Rule. Suppose S is a set of objects such that every object in S can 
be constructed in exactly one way by making a sequence of k choices, where k is a fixed
positive integer. Suppose the first choice can be made in $n_1$ ways; the second choice can be
made in $n_2$ ways, no matter what the first choice was; and so on. In general, for 1 ≤ i ≤ k,
suppose the ith choice can be made in $n_i$ ways, no matter what the previous choices were.
Then the total number of objects in S is $n_1 n_2 \cdots n_k$, the product of the number of choices
possible at each stage. 


We prove the Product Rule later (see §1.15). When using this rule to count objects, the 
key question to ask is: how can I build the objects I want by making a sequence of choices? 
The following examples illustrate how this works. 


1.2. Example: License Plates. A Virginia license plate consists of three uppercase letters 
followed by four digits. How many license plates are possible? To answer this with the 
Product Rule, we build a typical license plate by making a sequence of seven choices. First, 
choose the leftmost letter; there are $n_1 = 26$ ways to make this choice. Second, choose
the next letter in any of $n_2 = 26$ ways. Third, choose the next letter in any of $n_3 = 26$
ways. Fourth, choose the first digit in any of $n_4 = 10$ ways. Continue similarly; we have
$n_5 = n_6 = n_7 = 10$. By the Product Rule, the total number of license plates is

\[ n_1 n_2 \cdots n_7 = 26^3 \cdot 10^4 = 175{,}760{,}000. \]


1.3. Example: Numbers. How many odd four-digit numbers have no repeated digits and 
do not contain the digit 8? For example, 3461 and 1705 are valid numbers, but 3189 and 
1021 are not. We can build a number of the required type by choosing its digits one at a 
time in the following order. First, choose the rightmost digit. Since we want an odd number, 
there are $n_1 = 5$ possibilities for this digit (1 or 3 or 5 or 7 or 9). Second, choose the leftmost
digit. Here there are $n_2 = 7$ possibilities: of the ten digits 0 through 9, we must avoid 0,
8, and the digit chosen in the first stage. Third, choose the second digit from the left. Now
there are $n_3 = 7$ possibilities, since 0 is available, but we must avoid 8 and the two different
digits chosen in the first two stages. Fourth, choose the third digit from the left; there are
$n_4 = 6$ possibilities (avoid 8 and the three different digits chosen previously). The object is
now complete, so the Product Rule tells us there are

\[ n_1 n_2 n_3 n_4 = 5 \cdot 7 \cdot 7 \cdot 6 = 1470 \]


numbers satisfying the given conditions. 

In this example, we cannot arrive at the answer by choosing digits from left to right. 
Using this approach, we would find $n_1 = 8$ (avoid 0 and 8), $n_2 = 8$ (avoid 8 and the first
digit), and $n_3 = 7$ (avoid 8 and the first two digits). But when we try to choose the last
digit, what is n4? The last digit needs to be odd and different from the three previously 
chosen digits. The number of available choices at the fourth stage depends on how many 
of the previous digits are odd. So n4 depends on the particular choices made earlier, which 
violates the setup for the Product Rule. Thus, that rule does not apply to the construction 
method attempted here. In our solution above, we avoided this difficulty by choosing the 
rightmost digit first. 
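
Both points of this example can be checked by brute force. The Python sketch below (with helper names of our own) counts the qualifying numbers directly and also shows how the number of choices for the last digit varies with the earlier digits when we work from left to right.

```python
def qualifies(num):
    """True if num is an odd four-digit number with distinct digits, none equal to 8."""
    digits = str(num)
    return (len(digits) == 4 and num % 2 == 1
            and len(set(digits)) == 4 and "8" not in digits)

def last_digit_choices(prefix):
    """Number of valid final digits once the first three digits (a string) are fixed."""
    return sum(1 for d in "13579" if d not in prefix)

brute_force = sum(1 for num in range(1000, 10000) if qualifies(num))
product_rule = 5 * 7 * 7 * 6   # rightmost, leftmost, second, and third digit

print(brute_force, product_rule)        # 1470 1470
print(last_digit_choices("246"))        # 5: no odd digit has been used yet
print(last_digit_choices("135"))        # 2: only 7 and 9 remain
```

The last two lines illustrate why the left-to-right construction violates the hypothesis of the Product Rule: the number of options at the fourth stage is not the same for every sequence of earlier choices.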


1.4. Example: Poker Hands. A poker hand is a set of five different cards from a 52-card 
deck. Each card in the deck has a suit (clubs, diamonds, hearts, or spades) and a value (2 
through 10, jack, queen, king, or ace). For example, H = {4♥, 9♣, A♦, 9♥, J♠} is a poker
hand. How many poker hands are there? It might seem we could build a poker hand by
choosing the first card ($n_1 = 52$), then a second card different from the first ($n_2 = 51$), then
the third card ($n_3 = 50$), then the fourth card ($n_4 = 49$), then the fifth card ($n_5 = 48$).
Using the Product Rule would give $n_1 n_2 n_3 n_4 n_5$ = 311,875,200 possible hands. However,
this argument is flawed because, by definition, a poker hand is an unordered set of cards.
Thus we cannot speak of the “first” card in the hand. Another way to explain the error is to
notice that the same hand can be constructed in many different ways by making different
sequences of choices; but the Product Rule demands that each object to be counted must
arise from one and only one sequence of choices. For example, the hand H shown above
could be built by first choosing 4♥, then 9♣, then A♦, then 9♥, then J♠. But, we could
build the same hand by first choosing 9♥, then J♠, then 9♣, then 4♥, then A♦. By the
Product Rule, we see that there are 5·4·3·2·1 = 120 different orderings of these five cards
that would all produce the same (unordered) hand H. In fact, every poker hand (not just
H) is constructed 120 times by the choice process described above. So we can obtain the
correct count by dividing the initial answer by 120, getting N = 2,598,960 poker hands.
The Subset Rule (see 1.27 below) generalizes this result. 
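
The arithmetic above, and the claim that every hand is built the same number of times, can be illustrated in Python. In this sketch the small six-card "deck" is a hypothetical stand-in chosen so that the enumeration stays tiny; `math.perm` and `math.comb` require Python 3.8 or later.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial, perm

ordered_deals = perm(52, 5)          # 52 * 51 * 50 * 49 * 48 ordered sequences of cards
orderings_per_hand = factorial(5)    # each unordered hand arises from 5! = 120 sequences
hands = ordered_deals // orderings_per_hand
print(ordered_deals, hands, comb(52, 5))   # 311875200 2598960 2598960

# Toy version: 3-card hands from a 6-card deck, grouped by their underlying set.
times_built = Counter(frozenset(seq) for seq in permutations(range(6), 3))
print(len(times_built), set(times_built.values()))   # 20 {6}
```

In the toy deck every 3-card hand is produced exactly 3! = 6 times, which is the small-scale analogue of each poker hand being constructed 120 times.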


1.5. Example: Straight Poker Hands. A straight poker hand consists of five cards with 
consecutive values, where the ace can be treated as either a low card or a high card. For 
example, {A♦, 2♣, 3♦, 4♦, 5♣} and {K♣, 9♣, J♣, Q♣, 10♣} are straight poker hands (note
that the order of cards within the hand is irrelevant). How many straight poker hands are 
there? We can build all such hands by making the following choices: 


1. Choose the lowest value in the hand. There are $n_1 = 10$ possible values here (ace, 2, ...,
10). Since the hand is a straight, all values in the hand are determined by this choice.
If we choose 5 (say), then the partially constructed hand looks like {5-, 6-, 7-, 8-, 9-}.


2. Choose the suit for the lowest value in the hand; there are $n_2 = 4$ suits we could pick. In
our sample object, suppose we pick spades; the object now looks like {5♠, 6-, 7-, 8-, 9-}.


3. Choose the suit for the second lowest value in the hand; there are $n_3 = 4$ possibilities.
Picking hearts for our sample object, we now have {5♠, 6♥, 7-, 8-, 9-}.


4. Choose the suit for the third lowest value in the hand, so $n_4 = 4$. Perhaps we pick hearts
again; the partial hand is now {5♠, 6♥, 7♥, 8-, 9-}.
5. Choose the suit for the fourth lowest value in the hand, so $n_5 = 4$. Say we pick spades;
our object is now {5♠, 6♥, 7♥, 8♠, 9-}.


6. Choose the suit for the highest value in the hand, so $n_6 = 4$. If we pick diamonds, we
obtain the complete object {5♠, 6♥, 7♥, 8♠, 9♦}.


According to the Product Rule, there are $10 \cdot 4^5$ = 10,240 straight poker hands (which
include straight flush hands). Now, by definition, the probability of a straight poker hand 
is the number of such hands divided by N, the total number of five-card poker hands from 
Example 1.4. This probability is approximately 0.00394. (We discuss probability in more 
detail in §1.7.) 

In this example, we needed to think of a clever sequence of choices to build the straight 
poker hands that took advantage of the special structure of these hands. The more direct 
approach of choosing the cards in the hand one at a time might not work. For instance, 
if we choose an arbitrary card at stage 1 (rather than the low card), then the number of 
choices for cards in later stages depends on which choices were made earlier. 
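
The six-stage construction translates directly into nested choices. The Python sketch below (our own encoding: a card is a pair of a value string and a suit name) builds every straight hand this way and confirms that no hand is produced twice, which is exactly what the Product Rule requires.

```python
from itertools import product

# The ace appears at both ends because it may be low or high.
VALUES = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]

def straight_hands():
    """Build each straight hand from a lowest value and one suit per card."""
    hands = []
    for low in range(10):                         # choice 1: lowest value (ace through 10)
        values = VALUES[low:low + 5]              # the other four values are forced
        for suits in product(SUITS, repeat=5):    # choices 2 to 6: a suit for each value
            hands.append(frozenset(zip(values, suits)))
    return hands

hands = straight_hands()
print(len(hands), len(set(hands)))        # 10240 10240
print(round(len(hands) / 2598960, 5))     # 0.00394
```

Since the list and the set have the same size, each straight hand arises from exactly one sequence of choices.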


1.6. Example: Rook Placements. How many ways can we place four non-attacking 
rooks on the board shown below? (Recall that non-attacking means no two rooks are in the 
same row or column.) 


Evidently, each of the four rows must contain a rook. First, choose a square for the rook 
in the lowest row ($n_1 = 2$ ways). Second, choose a square for the rook in the next lowest
row; there are $n_2 = 2$ possible squares, since we must avoid the column of the rook already
placed. Third, choose a square for the rook in the next lowest row; there are $n_3 = 5 - 2 = 3$
choices here. Finally, place the rook in the highest row ($n_4 = 8 - 3 = 5$ ways). The Product
Rule gives $n_1 n_2 n_3 n_4 = 2 \cdot 2 \cdot 3 \cdot 5 = 60$ rook placements. If we tried to place rooks starting in
the top row and working down, the Product Rule could not be used since $n_2$ (for instance)
would depend on what choice was made in the top row. 
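
To experiment with this example, we need to fix the board. The Python sketch below assumes, based on the counts quoted in the solution, that the board consists of left-justified rows of lengths 2, 3, 5, and 8 (listed from the lowest row upward); this shape is our inference, and the function names are ours as well.

```python
from itertools import permutations

ROW_LENGTHS = [2, 3, 5, 8]   # assumed board: left-justified rows, lowest row first

def count_by_product_rule(row_lengths):
    """Fill rows from shortest to longest; each earlier rook removes one column."""
    total = 1
    for placed, length in enumerate(sorted(row_lengths)):
        total *= max(length - placed, 0)
    return total

def count_by_brute_force(row_lengths):
    """Try every choice of distinct columns, one per row, and keep those on the board."""
    width = max(row_lengths)
    return sum(1 for cols in permutations(range(width), len(row_lengths))
               if all(c < length for c, length in zip(cols, row_lengths)))

print(count_by_product_rule(ROW_LENGTHS), count_by_brute_force(ROW_LENGTHS))   # 60 60
```

The product-rule count relies on the rows being nested: every column available in a shorter row is also available in each longer row, so a rook placed earlier always removes exactly one option later.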




1.2 The Sum Rule 


The next counting rule, called the Sum Rule, allows us to count a complicated set of 
objects by breaking the set into a collection of smaller, simpler subsets that can be counted 
separately. 


1.7. The Sum Rule. Suppose S is a set of objects such that every object in S belongs
to exactly one of k non-overlapping categories, where k is a fixed positive integer. Suppose
the first category contains $m_1$ objects, the second category contains $m_2$ objects, and so on.
Then the total number of objects in S is $m_1 + m_2 + \cdots + m_k$, the sum of the number of
objects in each category. 


When using the Sum Rule, we must create the appropriate categories (subsets of S) that 
can be counted by other means. Often we design the categories so that each category can 
be counted using the Product Rule. It is crucial that each object in the original set belong 
to one and only one of the smaller categories, and that all of the categories be subsets of 
the original collection S. 
1.8. Example: Fraternity Names. The name of a fraternity is a sequence of two or
three uppercase Greek letters. How many possible fraternity names are there? (There are
24 letters in the Greek alphabet.) To solve this, we divide the set S of all such names into
two categories, where category 1 consists of the two-letter names, and category 2 consists
of the three-letter names. We can build objects in category 1 by choosing the first letter
in $n_1 = 24$ ways, and then choosing the second letter in $n_2 = 24$ ways. By the Product
Rule, category 1 has size $n_1 n_2 = 24^2 = 576$. Similarly, category 2 has size $24^3 = 13824$. By
the Sum Rule, the total number of names is 576 + 13824 = 14400. Note that when using 
the Product Rule, the number of stages k in the choice sequence must be the same for all 
objects in the set being counted. For the original set S, we needed k = 2 stages for the 
two-letter names, and k = 3 stages for the three-letter names. This is why the Sum Rule 
was used in this example (but see Exercise 1-4). 


Here and below, we write |S| to denote the number of elements in a finite set S.


1.9. Example: Numbers. How many even four-digit numbers have no repeated digits and 
do not contain the digit 8? Write such a number as abcd where a, b,c, d are digits between 0 
and 9. If we try to imitate the argument used in Example 1.3, choosing d, a, b, and c in this 
order, we find that the number of choices for a depends on whether d is zero. This suggests 
dividing the set S of objects being counted into two categories, where category 1 consists of 
numbers in S ending in zero, and category 2 consists of numbers in S not ending in zero. To
build an object in category 1, choose d = 0 ($n_1 = 1$ way), then choose a ($n_2 = 8$ ways, since
we must avoid 0 and 8), then choose b ($n_3 = 7$ ways, since we must avoid a and d and 8),
then choose c ($n_4 = 6$ ways). The Product Rule shows that there are $m_1 = 1 \cdot 8 \cdot 7 \cdot 6 = 336$
objects in category 1. To build an object in category 2, choose d ($n_1 = 3$ ways, since d can
be 2 or 4 or 6), then choose a ($n_2 = 7$ ways, since we must avoid 0 and 8 and d), then
choose b ($n_3 = 7$ ways, since we must avoid 8 and d and a), then choose c ($n_4 = 6$ ways).
By the Product Rule, there are $m_2 = 3 \cdot 7 \cdot 7 \cdot 6 = 882$ objects in category 2. By the Sum
Rule, S has size 336 + 882 = 1218. 

Here is another solution to the same problem. Let T be the set of all four-digit numbers
with distinct digits unequal to 8. On one hand, an argument using the Product Rule shows
that T has size $8 \cdot 8 \cdot 7 \cdot 6 = 2688$. On the other hand, we can write T as the union of two
categories: S (the set of even numbers in T) and U (defined to be the set of odd numbers in
T). In Example 1.3, we used the Product Rule to compute |U| = 1470. Moreover,
the Sum Rule tells us that |T| = |S| + |U|. So |S| = |T| − |U| = 2688 − 1470 = 1218.
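Both solutions of Example 1.9 can be confirmed by listing the numbers directly. This is a small sketch; the helper name `digits_ok` is an assumption of ours.

```python
def digits_ok(n: int) -> bool:
    """True if n has four distinct digits, none of which is 8."""
    s = str(n)
    return len(s) == 4 and len(set(s)) == 4 and "8" not in s

T = [n for n in range(1000, 10000) if digits_ok(n)]
S = [n for n in T if n % 2 == 0]  # even members of T
U = [n for n in T if n % 2 == 1]  # odd members of T

# Sum Rule: T is the disjoint union of S and U.
print(len(T), len(S), len(U))  # 2688 1218 1470
```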


1.10. Example: Rook Placements. How many ways can we place three non-attacking
rooks on the board shown here?

[Figure not reproduced: a board with four rows. The counts below correspond to rows of
8, 5, 3, and 2 squares from top to bottom, where each row's columns are among those of
the row above it.]

Let S be the set of all such rook placements. Since there are three rooks and four rows,
every object in S has exactly one empty row. This suggests creating four categories $S_1$, $S_2$,
$S_3$, and $S_4$, where $S_i$ consists of those placements in S where the ith row from the top is
empty. By the Sum Rule, $|S| = |S_1| + |S_2| + |S_3| + |S_4|$. Each set $S_i$ can be counted using
the Product Rule. For example, to build an object in $S_3$, place a rook in the lowest row
($n_1 = 2$ ways), then place a rook in the second row from the top ($n_2 = 5 - 1 = 4$ ways),
then place a rook in the top row ($n_3 = 8 - 2 = 6$ ways). Thus, $|S_3| = 2 \cdot 4 \cdot 6 = 48$. Similarly,
we find that $|S_1| = 2 \cdot 2 \cdot 3 = 12$, $|S_2| = 2 \cdot 2 \cdot 6 = 24$, and $|S_4| = 3 \cdot 4 \cdot 6 = 72$. So the total
number of placements is 12 + 24 + 48 + 72 = 156.
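The rook count can be checked by exhaustive search. The sketch below rests on an assumption about the board, since its picture is not available here: we take the rows (top to bottom) to have 8, 5, 3, and 2 squares, all left-justified, which is consistent with the factors 8 − 2, 5 − 1, 3, and 2 used in the example. The names `ROW_LENGTHS` and `count_with_rows` are ours.

```python
from itertools import permutations

# ASSUMPTION: left-justified rows of 8, 5, 3, 2 squares from top to bottom.
ROW_LENGTHS = [8, 5, 3, 2]

def count_with_rows(rows):
    """Count placements with one rook in each listed row and all columns distinct."""
    width = max(ROW_LENGTHS)
    return sum(
        all(col < ROW_LENGTHS[row] for row, col in zip(rows, cols))
        for cols in permutations(range(width), len(rows))
    )

# Category i = placements whose ith row from the top is empty.
by_empty_row = {}
for empty in range(4):
    rows = [r for r in range(4) if r != empty]
    by_empty_row[empty + 1] = count_with_rows(rows)

print(by_empty_row, sum(by_empty_row.values()))
```

Under the stated assumption, the four category sizes and their sum agree with the values computed by the Product and Sum Rules above.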
1.11. Example: Words. How many five-letter words (sequences of uppercase letters) have
the property that any Q appearing in the word is immediately followed by U? To solve this
problem with the Sum and Product Rules, we need to divide the given set of words into
appropriate categories. One approach is to create templates showing exactly which positions
in the word contain a Q. In the following templates, a star denotes a letter different from
Q.

1. *****    2. QU***    3. *QU**    4. **QU*

5. ***QU    6. QUQU*    7. QU*QU    8. *QUQU


For $1 \le i \le 8$, let $S_i$ be the set of words matching the ith template. Since there are 25 choices
for each starred position, the Product Rule gives $|S_1| = 25^5$, $|S_2| = |S_3| = |S_4| = |S_5| = 25^3$,
and $|S_6| = |S_7| = |S_8| = 25$. Then the Sum Rule gives the answer $25^5 + 4 \cdot 25^3 + 3 \cdot 25 =$
9,828,200 words.
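As a cross-check on the template count, here is a sketch that uses a different technique from the one in the example: a recursion on the word length (recursions are treated later in the text). The function name `count_qu_words` is our own. A valid word either starts with a non-Q letter followed by a shorter valid word, or starts with the block QU followed by a valid word that is two letters shorter.

```python
def count_qu_words(length: int, letters: int = 26) -> int:
    """Count words of the given length in which every Q is immediately followed by U."""
    # f[n] = number of valid words of length n; f[0] = 1 (empty word),
    # f[1] = letters - 1 (a lone Q is not allowed).
    f = [1, letters - 1]
    for n in range(2, length + 1):
        # First letter is not Q, or the word begins with the block QU.
        f.append((letters - 1) * f[n - 1] + f[n - 2])
    return f[length]

template_total = 25**5 + 4 * 25**3 + 3 * 25
print(count_qu_words(5), template_total)  # 9828200 9828200
```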


Beginners sometimes get confused about whether to use the Sum Rule or the Product 
Rule. Remember that the Product Rule is used when we are building each object being 
counted via a sequence of choices. The Sum Rule is used when we are subdividing the 
objects being counted into smaller, non-overlapping categories. 




1.3 Counting Words and Permutations 


In the next few sections, we apply the Sum and Product Rules to count basic combinatorial 
structures such as words, permutations, subsets, etc. These objects often appear as building 
blocks used to solve more complicated counting problems. 


1.12. Definition: Words. Let A be a finite set. A word in the alphabet A is a sequence
$w = w_1 w_2 \cdots w_k$, where each $w_i \in A$ and $k \ge 0$. The length of $w = w_1 w_2 \cdots w_k$ is k. Two
words $w = w_1 w_2 \cdots w_k$ and $z = z_1 z_2 \cdots z_m$ are equal iff¹ k = m and $w_i = z_i$ for $1 \le i \le k$.


1.13. Example. Let A = {a,b,c,...,z} be the set of 26 lowercase letters in the English 
alphabet. Then stop, opts, and stoops are distinct words (of lengths 4, 4, and 6, respectively). 
If A = {0,1}, the 8 words of length 3 in the alphabet A are 


000, 001, 010, 011, 100, 101, 110, 111. 


There is exactly one word of length zero, called the empty word. It is sometimes denoted
by the special symbols · or ε.


1.14. The Word Rule. If A is an n-letter alphabet and $k \ge 0$, then there are $n^k$ words
of length k over A.


Proof. For fixed k > 0, we can uniquely construct a typical word $w = w_1 w_2 \cdots w_k$ of length
k by a sequence of k choices. First, choose $w_1 \in A$ to be any of the n letters in A. Second,
choose $w_2 \in A$ in any of n ways. Continue similarly, choosing $w_i \in A$ in any of n ways for
$1 \le i \le k$. By the Product Rule, the number of words is $n \times n \times \cdots \times n$ (k factors), which
is $n^k$. Note that the empty word is the unique word of length 0 in the alphabet A, so our
formula holds for k = 0 also. □
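The choice sequence in this proof is exactly what `itertools.product` carries out. The following sketch (the helper name `words` is ours) lists the words of length k over a small alphabet and compares the count with $n^k$.

```python
from itertools import product

def words(alphabet, k):
    """Yield every word of length k over the alphabet, as a string."""
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)

binary_words = list(words("01", 3))
print(binary_words)
# ['000', '001', '010', '011', '100', '101', '110', '111']

# The Word Rule: n^k words, including the single empty word when k = 0.
for k in range(5):
    assert len(list(words("abc", k))) == 3**k
```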


1.15. Definition: Permutations. Let A be an n-element set. A permutation of A is a
word $w = w_1 w_2 \cdots w_n$ in which each letter of A appears exactly once. Permutations are
also called rearrangements or linear orderings of A.


¹ Here and below, iff is an abbreviation for if and only if.

1.16. Example. The six permutations of A = {x, y, z} are

xyz, xzy, yxz, yzx, zxy, zyx.

Some linear orderings of {1, 2, 3, 4, 5} are 35142, 54321, 24513, and 12345.


1.17. Definition: Factorials. For each integer $n \ge 1$, n-factorial is

$$n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1,$$

which is the product of the first n positive integers. We also define 0! = 1.


1.18. The Permutation Rule. There are n! permutations of an n-letter alphabet A.


Proof. Build a typical permutation $w = w_1 w_2 \cdots w_n$ of A by making n choices. First, choose
$w_1$ to be any of the n letters of A. Second, choose $w_2$ to be any of the n − 1 letters of A
different from $w_1$. Third, choose $w_3$ to be any of the n − 2 letters of A different from $w_1$
and $w_2$. Proceed similarly; at the nth stage, choose $w_n$ to be the unique letter of A that
is different from $w_1, w_2, \ldots, w_{n-1}$. By the Product Rule, the number of permutations is
$n \times (n-1) \times \cdots \times 1 = n!$. The result also holds when n = 0. □


Sometimes we want to count partial permutations in which not every letter of the
alphabet gets used.


1.19. Definition: k-Permutations. Let A be an n-element set. A k-permutation of A is
a word $w = w_1 w_2 \cdots w_k$ consisting of k distinct letters in A. An n-permutation of A is the
same as a permutation of A.


1.20. Example. The twelve 2-permutations of A = {a, b, c, d} are
ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc.


1.21. The Partial Permutation Rule. Suppose A is an n-letter alphabet. For $0 \le k \le n$,
the number of k-permutations of A is

$$n(n-1)(n-2)\cdots(n-k+1) = \frac{n!}{(n-k)!}.$$

For k > n, there are no k-permutations of A.


Proof. Build a typical k-permutation $w = w_1 w_2 \cdots w_k$ of A by making k choices. First,
choose $w_1$ to be any of the n letters of A. Second, choose $w_2$ to be any of the n − 1 letters
of A different from $w_1$. Continue similarly. When we choose $w_i$ (where $1 \le i \le k$), we have
already used the i − 1 distinct letters $w_1, w_2, \ldots, w_{i-1}$. Since A has n letters, there are
n − (i − 1) = n − i + 1 choices available at stage i. In particular, for the kth and final choice,
there are n − k + 1 ways to choose $w_k$. By the Product Rule, the number of k-permutations
is $\prod_{i=1}^{k} (n - (i-1)) = n(n-1)\cdots(n-k+1)$. Multiplying this expression by $(n-k)!/(n-k)!$,
we obtain the product of the integers 1 through n in the numerator, which is n!. Thus the
answer is also given by the formula $n!/(n-k)!$. □
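The Permutation Rule and the Partial Permutation Rule can be illustrated with `itertools.permutations`, which builds words of distinct letters by the same sequence of choices. This is a sketch only; the helper name `k_permutations` is ours.

```python
from itertools import permutations
from math import factorial

def k_permutations(alphabet, k):
    """List all k-permutations of the alphabet as strings."""
    return ["".join(p) for p in permutations(alphabet, k)]

two_perms = k_permutations("abcd", 2)
print(two_perms)
# ['ab', 'ac', 'ad', 'ba', 'bc', 'bd', 'ca', 'cb', 'cd', 'da', 'db', 'dc']

n = 4
for k in range(n + 1):
    # Partial Permutation Rule: n!/(n-k)! words of k distinct letters.
    assert len(k_permutations("abcd", k)) == factorial(n) // factorial(n - k)

# For k > n there are no k-permutations.
assert k_permutations("abcd", 5) == []
```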


1.22. Example. How many six-letter words: (a) begin with a consonant and end with a
vowel; (b) have no repeated letters; (c) have no two consecutive letters equal; (d) have no
two consecutive vowels? Assume we are using the alphabet {A,B,...,Z} with vowels A, E,
I, O, and U.

For (a), we build a word by choosing the consonant at the beginning ($n_1 = 21$ ways),
then choosing the next four letters ($n_2 = 26^4$ ways, by the Word Rule), then choosing the
vowel at the end ($n_3 = 5$ ways). By the Product Rule, the answer to (a) is $21 \cdot 26^4 \cdot 5$. For
(b), we use the Partial Permutation Rule with n = 26 and k = 6 to get 26!/20! words. For
(c), we choose the letters from left to right, getting $n_1 = 26$ and $n_2 = n_3 = \cdots = n_6 = 25$.
Note that when we choose the ith letter with i > 1, we only need to avoid the one letter
immediately preceding it in the word built so far. The answer to (c) is $26 \cdot 25^5$.

For (d), we use the Sum Rule, dividing the words we want into many smaller
categories. We label each category with a template showing the possible positions of vowels
and consonants in the words in that category. For example, the template CVCCVC labels
the category of words with vowels in positions 2 and 5 and consonants elsewhere. By the
Product Rule, this category has size $21^4 \cdot 5^2$ (and similarly for any other template with
exactly two V's). There is one template CCCCCC with no vowels, which stands for a set
of $21^6$ words. There are six templates with one vowel, each of which stands for a set of
$21^5 \cdot 5$ words. By explicitly listing them, we find there are ten templates with two vowels:
VCVCCC, VCCVCC, VCCCVC, VCCCCV, CVCVCC, CVCCVC, CVCCCV, CCVCVC,
CCVCCV, and CCCVCV. There are four templates with three vowels, namely VCVCVC,
VCVCCV, VCCVCV, and CVCVCV, each of which stands for a set of $21^3 \cdot 5^3$ words. No
template has four or more vowels, since four vowels cannot occupy six positions without
two of them being adjacent. Combining all these counts with the Sum Rule, the answer to
(d) is

$$21^6 + 6 \cdot 21^5 \cdot 5 + 10 \cdot 21^4 \cdot 5^2 + 4 \cdot 21^3 \cdot 5^3 = 261{,}539{,}901.$$
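Listing templates by hand is error-prone, so it is worth generating them mechanically. The sketch below (with our own names `templates`, `template_size`, and `by_vowel_count`) enumerates every consonant/vowel template of length six with no two adjacent V's, groups the templates by number of vowels, and adds up the category sizes given by the Product Rule. It also evaluates the answers to parts (a), (b), and (c).

```python
from itertools import product
from math import factorial

VOWELS, CONSONANTS, LENGTH = 5, 21, 6

def template_size(template: str) -> int:
    """Product Rule: 5 choices for each V position, 21 for each C position."""
    size = 1
    for symbol in template:
        size *= VOWELS if symbol == "V" else CONSONANTS
    return size

# All templates over {C, V} of length 6 with no two consecutive V's.
templates = ["".join(t) for t in product("CV", repeat=LENGTH)]
templates = [t for t in templates if "VV" not in t]

by_vowel_count = {}
for t in templates:
    by_vowel_count[t.count("V")] = by_vowel_count.get(t.count("V"), 0) + 1

answer_a = 21 * 26**4 * 5
answer_b = factorial(26) // factorial(20)
answer_c = 26 * 25**5
answer_d = sum(template_size(t) for t in templates)  # Sum Rule over templates

print(by_vowel_count)
print(answer_a, answer_b, answer_c, answer_d)
```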


1.23. Example. Three married couples and a single guy are standing in line at a bank. 
How many ways can the line be formed if: (a) the three women are at the front of the line; 
(b) men and women alternate in the line; (c) each husband stands adjacent to his wife in 
the line; (d) the single guy is not adjacent to a woman? 

We solve these questions by combining the Sum, Product, and Permutation Rules. For
(a), we build the line by first choosing a linear ordering of the three women ($n_1 = 3! = 6$
ways) and then choosing a permutation of the four men behind them ($n_2 = 4! = 24$ ways);
the Product Rule gives $6 \cdot 24 = 144$ possible lines.

For (b), note that the men must occupy positions 1, 3, 5, and 7 in the line, and the
women must occupy positions 2, 4, and 6. To build such a line, first choose a permutation
of the men ($n_1 = 4! = 24$ ways) and then place the men (in this order) in the odd-numbered
positions of the line. Next choose a permutation of the women ($n_2 = 3! = 6$ ways) and then
place the women (in this order) in the even-numbered positions of the line. As in (a), the
total count is $24 \cdot 6 = 144$.

For (c), construct the line by the following choices. First, choose a permutation of the
four objects consisting of the three married couples and the single guy ($n_1 = 4! = 24$ ways).
Second, for the married couple closest to the front of the line, choose whether the man or
woman comes first ($n_2 = 2$ ways). Third, for the next married couple, choose who comes
first ($n_3 = 2$ ways). Fourth, for the last married couple, choose who comes first ($n_4 = 2$
ways). The answer is $24 \cdot 2^3 = 192$. If all the married couples stood side by side (rather than
adjacent in line), the answer would be 24.

For (d), we use the Sum Rule. Divide the lines being counted into three categories.
Category 1 consists of lines with the single guy first; category 2 consists of lines with the
single guy last; and category 3 consists of lines with the single guy somewhere in the middle.
To build a line in category 1, put the single guy at the front (1 way); then choose a man to
stand behind him (3 ways); then permute the remaining five people (5! = 120 ways). This
gives $1 \cdot 3 \cdot 120 = 360$ objects in category 1. Similarly, category 2 has 360 objects. To build a
line in category 3, put the single guy somewhere in the middle (5 ways); then choose a man
to stand in front of him (3 ways); then choose a different man to stand behind him (2 ways);
then permute the remaining four people and put them in the remaining positions in the line
from front to back (4! = 24 ways). By the Product Rule, category 3 has $5 \cdot 3 \cdot 2 \cdot 24 = 720$
objects. The answer to (d) is 360 + 360 + 720 = 1440.
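With only 7! = 5040 possible lines, all four answers in Example 1.23 can be checked by testing every line. In this sketch the labels `M1`/`W1`, `M2`/`W2`, `M3`/`W3` (the couples) and `G` (the single man) are hypothetical names of our own.

```python
from itertools import permutations

# Hypothetical labels: Mi and Wi are married to each other; G is the single man.
PEOPLE = ["M1", "W1", "M2", "W2", "M3", "W3", "G"]

def is_woman(p):
    return p.startswith("W")

def adjacent(line, p, q):
    return abs(line.index(p) - line.index(q)) == 1

def cond_a(line):  # the three women are at the front
    return all(is_woman(p) for p in line[:3])

def cond_b(line):  # men and women alternate
    return all(is_woman(line[i]) != is_woman(line[i + 1]) for i in range(6))

def cond_c(line):  # each husband is adjacent to his wife
    return all(adjacent(line, f"M{i}", f"W{i}") for i in (1, 2, 3))

def cond_d(line):  # the single man is not adjacent to a woman
    i = line.index("G")
    neighbors = line[max(i - 1, 0):i] + line[i + 1:i + 2]
    return not any(is_woman(p) for p in neighbors)

lines = list(permutations(PEOPLE))
counts = {
    name: sum(1 for line in lines if cond(line))
    for name, cond in [("a", cond_a), ("b", cond_b), ("c", cond_c), ("d", cond_d)]
}
print(counts)  # {'a': 144, 'b': 144, 'c': 192, 'd': 1440}
```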
1.4 Counting Subsets


In this section, we count subsets of a given finite set S. Recall that $T \subseteq S$ means that T is
a subset of S, i.e., for all objects x, if $x \in T$ then $x \in S$. Sets S and T are equal iff $T \subseteq S$
and $S \subseteq T$, which means that S and T have exactly the same members.


1.24. Definition: Power Set. For any set S, the power set P(S) is the set of all subsets
of S. Thus, $T \in P(S)$ iff $T \subseteq S$.


1.25. Example. If S = {2, 5, 7}, then P(S) is the eight-element set
{∅, {2}, {5}, {7}, {2, 5}, {2, 7}, {5, 7}, {2, 5, 7}}.


1.26. The Power Set Rule. An n-element set has $2^n$ subsets. In other words, if |S| = n,
then $|P(S)| = 2^n$.


Proof. Suppose $S = \{x_1, \ldots, x_n\}$ is an n-element set. We can build a typical subset T of
S by making a sequence of n choices. First, decide whether $x_1$ is or is not a member of T;
there are $n_1 = 2$ possible choices here. Second, decide whether $x_2 \in T$ or $x_2 \notin T$; again
there are two possibilities. Continue similarly; decide in the ith choice whether $x_i \in T$ or
$x_i \notin T$ ($n_i = 2$ possible choices). This sequence of choices determines which $x_i$'s belong
to T. Since T is a subset of S, this information uniquely characterizes the set T. By the
Product Rule, the number of subsets is $2 \times 2 \times \cdots \times 2$ (n factors), which is $2^n$. □


To illustrate this proof, suppose S = {2, 5, 7} and $x_1 = 2$, $x_2 = 5$, and $x_3 = 7$. When
building a subset T, we might make these choices: 2 is in T; 5 is not in T; 7 is in T. This
choice sequence creates the subset T = {2, 7}. Another choice sequence is: $2 \notin T$; $5 \notin T$;
$7 \notin T$. This choice sequence constructs the empty set ∅.
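The proof of the Power Set Rule translates directly into code: one in/out decision per element. A minimal sketch follows, with the function name `power_set` chosen by us.

```python
from itertools import product

def power_set(elements):
    """Build every subset by making one in/out choice per element."""
    elements = list(elements)
    subsets = []
    for choices in product([False, True], repeat=len(elements)):
        # choices[i] records whether elements[i] belongs to the subset.
        subsets.append(frozenset(x for x, keep in zip(elements, choices) if keep))
    return subsets

P = power_set([2, 5, 7])
print(len(P))  # 8
# The choice sequence (in, out, in) builds {2, 7}; (out, out, out) builds the empty set.
assert frozenset({2, 7}) in P and frozenset() in P
```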

Next we consider the enumeration of subsets of an n-element set having a fixed size k. 
For example, there are ten 3-element subsets of {a, b, c, d, e}:


{a, b, c}, {a, b, d}, {a, b, e}, {a, c, d}, {a, c, e},


{a, d, e}, {b, c, d}, {b, c, e}, {b, d, e}, {c, d, e}.

In this example, we present a given set by listing its members between curly braces. This 
notation forces us to list the members of each set in a particular order (alphabetical in 
this case). If we reorder the members of the list, the underlying set does not change. For 
example, the sets {a, c, d} and {c, d, a} and {d, c, a} are all the same set. Similarly, listing an
element more than once does not change the set: {a, c, d} and {a, a, c, c, c, d} and {d, d, a, c}
are all the same set. These assertions follow from the very definition of set equality: A = B 
means that for every x, x € A iff x € B. The sets mentioned earlier in this paragraph have 
members a and c and d and nothing else, so they are all equal. In contrast, the ordering and 
repetition of elements in a sequence (or word) definitely makes a difference. For instance, 
the words cad, dac, and acadd are unequal although they use the same three letters. 

Suppose we try to count the k-element subsets of a given n-element set using the Product 
Rule. This rule requires us to construct objects by making an ordered sequence of choices. 
We might try to construct a subset by choosing its first element in n ways, then its second 
element in n — 1 ways, etc., which leads to the incorrect answer n(n — 1)---(n —k +1). 
The trouble here is that there is no well-defined first element of a subset. In fact, our 
construction procedure generates each subset several times, once for each possible ordering 
of its members. There are k! such orderings, so we obtain the correct answer by dividing 
the previous formula by k! (cf. Example 1.4). The next proof gives an alternate version of 
this argument that avoids overcounting. 
1.27. The Subset Rule. For $0 \le k \le n$, the number of k-element subsets of an n-element
set is

$$\frac{n!}{k!\,(n-k)!}.$$


Proof. Fix n and k with $0 \le k \le n$. Let A be an n-element set, and let x denote the number
of k-element subsets of A. Our goal is to show $x = n!/(k!(n-k)!)$. Let S be the set of all
k-permutations of A. Recall that elements of S are ordered sequences $w_1 w_2 \cdots w_k$, where the
$w_i$ are distinct elements of A. We compute |S| in two ways. On one hand, $|S| = n!/(n-k)!$
by the Partial Permutation Rule. On the other hand, we can build an object in S by first
choosing an (unordered) k-element subset of A in any of x ways, and then choosing a
linear ordering of these k objects in any of k! ways. Every object in S can be constructed
in exactly one way by this sequence of two choices. By the Product Rule, $|S| = x \cdot k!$.
Comparing the two formulas for |S| and solving for x, we get $x = n!/(k!(n-k)!)$ as needed.
(This argument is an example of a combinatorial proof, in which we establish an algebraic
formula by counting a given set of objects in two different ways. We study combinatorial
proofs in more detail in Chapter 2.) □
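The two ways of counting S in this proof can be carried out on a small case. The sketch below, assuming the sample alphabet `"abcde"` and k = 3, builds every k-permutation by first choosing a subset and then ordering it, and checks that each k-permutation arises exactly once.

```python
from itertools import combinations, permutations
from math import factorial

A, k = "abcde", 3
n = len(A)

# First count: all k-permutations of A (Partial Permutation Rule).
k_perms = set(permutations(A, k))

# Second count: choose a k-element subset, then a linear ordering of it.
subsets = list(combinations(A, k))
rebuilt = [p for subset in subsets for p in permutations(subset)]

# Each k-permutation is produced exactly once, so |S| = x * k!.
assert len(rebuilt) == len(set(rebuilt)) and set(rebuilt) == k_perms
assert len(k_perms) == factorial(n) // factorial(n - k) == len(subsets) * factorial(k)
print(len(subsets))  # 10
```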


The Subset Rule is used very frequently in counting arguments, so we introduce the 
following notation for the quotient of factorials appearing in this rule. 


1.28. Definition: Binomial Coefficients. For $0 \le k \le n$, the binomial coefficient is

$$\binom{n}{k} = C(n,k) = \frac{n!}{k!\,(n-k)!}.$$

For k < 0 or k > n, we define $\binom{n}{k} = C(n,k) = 0$. Thus, for all $n \ge 0$ and all k, $\binom{n}{k}$ is the
number of k-element subsets of an n-element set. In particular, $\binom{n}{k}$ is always an integer.

Here are some counting problems whose solutions use the Subset Rule.


1.29. Example: Flush Poker Hands. A five-card poker hand is a flush iff every card in
the hand has the same suit. For example, {3♥, 7♥, 9♥, J♥, Q♥} is a flush. We can build a
flush by first choosing the common suit for the five cards in $n_1 = 4$ ways, and then choosing
a subset of five cards out of the set of 13 cards having this suit. By the Subset Rule, there
are $n_2 = \binom{13}{5} = 1287$ ways to make this second choice. By the Product Rule, the number of
flush poker hands is $4 \cdot 1287 = 5148$. To find the probability of a flush, we divide this number
by the total number of poker hands. A poker hand is a 5-element subset of the set of 52
cards in a deck, so the Subset Rule tells us there are $\binom{52}{5}$ poker hands (cf. Example 1.4).
The required probability is approximately 0.00198, about half the probability of getting a
straight poker hand. For this reason, a flush beats a straight in poker.


1.30. Example: Full House Poker Hands. A five-card poker hand with three cards
of one value and two cards of another value is called a full house. For example,
{4♥, 4♣, 4♠, 7♠, 7♥} is a full house. The best way to count full house poker hands (and
hands with similar restrictions on the suits and values) is to construct the hand gradually
by choosing suits and values rather than choosing individual cards. For instance, we can
build a full house hand as follows.


1. Choose the value that will occur three times in the hand in $n_1 = 13$ ways. To build
the sample hand above, we would choose the value 4 here, obtaining the partial hand
{4−, 4−, 4−, ?, ?}.


2. Choose the value that will occur twice in the hand in $n_2 = 12$ ways (since we cannot
reuse the value chosen first). In our example, we would choose the value 7 here, obtaining
the partial hand {4−, 4−, 4−, 7−, 7−}.


3. Choose a subset of three suits (out of four) for the value that occurs three times. By the
Subset Rule, this can be done in $\binom{4}{3} = 4$ ways. In our example, we choose the subset
{♥, ♣, ♠}, leading to the partial hand {4♥, 4♣, 4♠, 7−, 7−}.


4. Choose a subset of two suits (out of four) for the value that occurs twice. By the Subset
Rule, this can be done in $\binom{4}{2} = 6$ ways. In our example, we choose the subset {♠, ♥}
for the 7's, giving us the completed hand displayed above.


The total number of full house hands is $13 \cdot 12 \cdot 4 \cdot 6 = 3744$. The probability of a full house
is $3744/\binom{52}{5} \approx 0.00144$.
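The flush and full house counts are short products of binomial coefficients, so they are easy to evaluate with `math.comb`. The variable names in this sketch are our own.

```python
from fractions import Fraction
from math import comb

total_hands = comb(52, 5)  # all 5-element subsets of the deck

# Flush: choose the suit, then 5 of the 13 cards of that suit.
flushes = 4 * comb(13, 5)

# Full house: value for the triple, value for the pair, suits for each.
full_houses = 13 * 12 * comb(4, 3) * comb(4, 2)

p_flush = Fraction(flushes, total_hands)
p_full_house = Fraction(full_houses, total_hands)
print(flushes, full_houses, total_hands)  # 5148 3744 2598960
print(round(float(p_flush), 5), round(float(p_full_house), 5))  # 0.00198 0.00144
```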


1.31. Example: Numbers. How many six-digit numbers contain exactly three copies of
the digit 7? Since the first digit is special (it cannot be zero), we create two categories: let $S_1$
be the set of such numbers that start with 7, and let $S_2$ be the set of such numbers that do
not start with 7. To build a number in $S_1$, first put 7 in the first position ($n_1 = 1$ way); then
choose a subset of two of the remaining five positions for the other 7's ($n_2 = \binom{5}{2} = 10$ ways,
by the Subset Rule); then fill the three unused positions from left to right with digits other
than 7 ($n_3 = 9^3$ ways, by the Word Rule). The Product Rule gives $|S_1| = 1 \cdot 10 \cdot 9^3 = 7290$.
To build a number in $S_2$, first choose a subset of three of the five non-initial positions to
contain 7's ($n_1 = \binom{5}{3} = 10$ ways, by the Subset Rule); then choose a digit unequal to 0 or
7 for the first position ($n_2 = 8$ ways); then choose the two remaining digits ($n_3 = n_4 = 9$
ways). We see that $|S_2| = 10 \cdot 8 \cdot 9 \cdot 9 = 6480$, so the answer is 7290 + 6480 = 13770. Another
approach to this problem leads to the formula $\binom{6}{3} 9^3 - \binom{5}{3} 9^2 = 13770$; can you reconstruct
the counting argument that leads to this formula?
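There are only 900,000 six-digit numbers, so the answer can be confirmed by direct search. The sketch below also evaluates the two-category count and the alternative formula, which we read as $\binom{6}{3} 9^3 - \binom{5}{3} 9^2$; the variable names are ours.

```python
from math import comb

# Direct search over all six-digit numbers.
brute = sum(1 for n in range(100000, 1000000) if str(n).count("7") == 3)

# Two categories: numbers starting with 7, and numbers not starting with 7.
by_categories = 1 * comb(5, 2) * 9**3 + comb(5, 3) * 8 * 9 * 9

# Alternative formula quoted at the end of the example.
alternative = comb(6, 3) * 9**3 - comb(5, 3) * 9**2

print(brute, by_categories, alternative)  # 13770 13770 13770
```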
1.5 Counting Anagrams


For our next counting problem, we enumerate words where each letter appears a specified
number of times.


1.32. Definition: Anagrams. Suppose $a_1, \ldots, a_k$ are distinct letters from an alphabet A
and $n_1, \ldots, n_k$ are nonnegative integers. Let $R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})$ denote the set of all words
$w = w_1 w_2 \cdots w_n$ that are formed by rearranging $n_1$ copies of $a_1$, $n_2$ copies of $a_2$, ..., and
$n_k$ copies of $a_k$ (so that $n = n_1 + n_2 + \cdots + n_k$). Words in a given set $R(a_1^{n_1} \cdots a_k^{n_k})$ are
said to be anagrams or rearrangements of one another.


1.33. Example. The set of all anagrams of the word 00111 is
$R(0^2 1^3)$ = {00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100}.
The set $R(a^1 b^2 c^1 d^0)$ consists of the words
{abbc, abcb, acbb, babc, bacb, bbac, bbca, bcab, bcba, cabb, cbab, cbba}.


1.34. The Anagram Rule. Suppose $a_1, \ldots, a_k$ are distinct letters, $n_1, \ldots, n_k$ are
nonnegative integers, and $n = n_1 + \cdots + n_k$. Then

$$|R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})| = \frac{n!}{n_1!\, n_2! \cdots n_k!}.$$


We give two proofs of this result. First Proof: We give a combinatorial argument similar
to our proof of the Subset Rule. Define a new alphabet A consisting of n distinct letters by
attaching distinct numerical superscripts to each copy of the given letters $a_1, \ldots, a_k$:

$$A = \{a_1^{(1)}, a_1^{(2)}, \ldots, a_1^{(n_1)}, a_2^{(1)}, \ldots, a_2^{(n_2)}, \ldots, a_k^{(1)}, \ldots, a_k^{(n_k)}\}.$$

Let $x = |R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})|$. Let S be the set of all permutations of the n-letter alphabet
A. We count |S| in two ways. On one hand, we know |S| = n! by the Permutation Rule. On
the other hand, the following sequence of choices constructs each object in S exactly once.
First, choose a word $v \in R(a_1^{n_1} \cdots a_k^{n_k})$ in any of x ways. Second, choose a linear ordering
of the superscripts 1 through $n_1$ and attach these superscripts (in the chosen order) to the
$n_1$ copies of $a_1$ in v. By the Permutation Rule, this second choice can be made in $n_1!$ ways.
Third, choose a linear ordering of 1 through $n_2$ and attach these superscripts to the $n_2$
copies of $a_2$ in v; there are $n_2!$ ways to make this choice. Continue similarly; at the last
stage, we attach the superscripts 1 through $n_k$ to the $n_k$ copies of $a_k$ in v in any of $n_k!$
ways. By the Product Rule, $|S| = x \cdot n_1! \cdot n_2! \cdots n_k!$. Since |S| = n! also, solving for x
gives the formula asserted in the Anagram Rule.

Second Proof. This proof makes repeated use of the Subset Rule followed by an algebraic
manipulation of factorials. We construct a typical object $w = w_1 w_2 \cdots w_n \in R(a_1^{n_1} \cdots a_k^{n_k})$
by making the following sequence of k choices. Intuitively, we are going to choose the
positions of the $a_1$'s, then the positions of the $a_2$'s, etc. First, choose any $n_1$-element subset
$S_1$ of {1, 2, ..., n} in any of $\binom{n}{n_1}$ ways, and define $w_i = a_1$ for all $i \in S_1$. Second, choose
any $n_2$-element subset $S_2$ of the $n - n_1$ unused positions in any of $\binom{n - n_1}{n_2}$ ways, and define
$w_i = a_2$ for all $i \in S_2$. At the jth stage (where $1 \le j \le k$), we have already filled the
positions in $S_1$ through $S_{j-1}$, so there are $n - n_1 - n_2 - \cdots - n_{j-1}$ remaining positions
in the word. We choose any $n_j$-element subset $S_j$ of these remaining positions in any of
$\binom{n - n_1 - \cdots - n_{j-1}}{n_j}$ ways, and define $w_i = a_j$ for all $i \in S_j$. By the Product Rule, the number
of rearrangements is

$$\binom{n}{n_1} \binom{n - n_1}{n_2} \binom{n - n_1 - n_2}{n_3} \cdots \binom{n - n_1 - \cdots - n_{k-1}}{n_k}.$$

This is a telescoping product that simplifies to $n!/(n_1!\, n_2! \cdots n_k!)$. For instance, when k = 4,
the product is

$$\frac{n!}{n_1!\,(n - n_1)!} \cdot \frac{(n - n_1)!}{n_2!\,(n - n_1 - n_2)!} \cdot \frac{(n - n_1 - n_2)!}{n_3!\,(n - n_1 - n_2 - n_3)!} \cdot \frac{(n - n_1 - n_2 - n_3)!}{n_4!\,(n - n_1 - n_2 - n_3 - n_4)!},$$

which simplifies to $n!/(n_1!\, n_2!\, n_3!\, n_4!)$, using the fact that $(n - n_1 - n_2 - n_3 - n_4)! = 0! = 1$.
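Both forms of the count in the Anagram Rule can be compared numerically. This sketch (function names `multinomial`, `by_subset_choices`, and `anagrams` are ours) evaluates the factorial quotient, evaluates the product of binomial coefficients from the second proof, and lists the anagrams of short words directly.

```python
from itertools import permutations
from math import comb, factorial, prod

def multinomial(counts):
    """n! / (n_1! n_2! ... n_k!) where n is the sum of the counts."""
    return factorial(sum(counts)) // prod(factorial(c) for c in counts)

def by_subset_choices(counts):
    """Second proof: choose positions for each letter among those still unused."""
    remaining, result = sum(counts), 1
    for c in counts:
        result *= comb(remaining, c)
        remaining -= c
    return result

def anagrams(word):
    """Sorted list of the distinct rearrangements of a (short) word."""
    return sorted({"".join(p) for p in permutations(word)})

assert len(anagrams("00111")) == multinomial([2, 3]) == by_subset_choices([2, 3])
assert len(anagrams("abbc")) == multinomial([1, 2, 1, 0]) == by_subset_choices([1, 2, 1, 0])
print(len(anagrams("00111")), len(anagrams("abbc")))  # 10 12
```

Listing all permutations is only practical for short words; the formula is what makes longer words tractable.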


1.35. Example. We now illustrate the constructions in each of the two preceding proofs.
For the first proof, suppose we are counting $R(a^3 b^1 c^4)$. The alphabet A in the proof consists
of eight distinct letters:

$$A = \{a^{(1)}, a^{(2)}, a^{(3)}, b^{(1)}, c^{(1)}, c^{(2)}, c^{(3)}, c^{(4)}\}.$$

Let us build a specific permutation of A using the second counting method. First, choose an
element of $R(a^3 b^1 c^4)$, say v = baccaacc. Second, choose a linear ordering of the superscripts
{1, 2, 3}, say 312, and label the a's from left to right with these superscripts, obtaining
$b\, a^{(3)} c\, c\, a^{(1)} a^{(2)} c\, c$. Third, choose a linear ordering of {1}, namely 1, and label the b with this
superscript, producing $b^{(1)} a^{(3)} c\, c\, a^{(1)} a^{(2)} c\, c$. Finally, choose a linear ordering of {1, 2, 3, 4},
say 1243, and label the c's accordingly to get $b^{(1)} a^{(3)} c^{(1)} c^{(2)} a^{(1)} a^{(2)} c^{(4)} c^{(3)}$. We have now
constructed a permutation of the alphabet A.

Next, let us see how to build the word baccaacc using the method of the second proof.
Start with an empty 8-letter word, which we denote − − − − − − − −. We first choose the
3-element subset {2, 5, 6} of {1, 2, ..., 8} and put a's in those positions, obtaining the partial
word − a − − a a − −. We then choose the 1-element subset {1} of {1, 3, 4, 7, 8} and put a b in
that position, obtaining b a − − a a − −. Finally, we choose the 4-element subset {3, 4, 7, 8} of
{3, 4, 7, 8} and put c's in those positions, obtaining the final word baccaacc.


1.36. Definition: Multinomial Coefficients. Suppose $n_1, \ldots, n_k$ are nonnegative
integers and $n = n_1 + \cdots + n_k$. The multinomial coefficient is

$$\binom{n}{n_1, n_2, \ldots, n_k} = C(n; n_1, n_2, \ldots, n_k) = \frac{n!}{n_1!\, n_2! \cdots n_k!}.$$

This is the number of rearrangements of k letters where there are $n_i$ copies of the ith letter.
In particular, $\binom{n}{n_1, n_2, \ldots, n_k}$ is always an integer.

Binomial coefficients are a special case of multinomial coefficients. Indeed, it is immediate
from the definitions that

$$\binom{a+b}{a} = \frac{(a+b)!}{a!\, b!} = \binom{a+b}{a, b}$$

for all integers $a, b \ge 0$.


1.37. Example. (a) Count the number of anagrams of MADAMIMADAM. (b) In how
many anagrams do the four M's occur consecutively? (c) The word in (a) is a palindrome
(a word that reads the same forward and backward). How many anagrams of this word are
also palindromes?

We solve (a) using the Anagram Rule: we seek the size of $R(A^4 D^2 I^1 M^4)$, which is
$\binom{11}{4,2,1,4} = 11!/(4!\,2!\,1!\,4!) = 34650$. Part (b) can also be solved with the Anagram Rule, if
we think of the four consecutive M's as a single meta-letter [MMMM]. Here the answer is

$$|R(A^4 D^2 I^1 [\mathrm{MMMM}]^1)| = \binom{8}{4, 2, 1, 1} = \frac{8!}{4!\, 2!\, 1!\, 1!} = 840.$$

For (c), we can build the palindromic anagrams as follows. First, put the sole copy of I in
the middle position (1 way). Second, choose an anagram of MADAM to place before the
I in $\binom{5}{2,2,1} = 30$ ways. Third, complete the word by filling the last five positions with the
reversal of the word in the first five positions (1 way). The Product Rule gives 30 as the
answer.
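The three answers in Example 1.37 can be reproduced as follows. This is a sketch with our own helper name `count_anagrams`; parts (a) and (b) use the Anagram Rule formula, and part (c) builds the palindromes explicitly from the 30 anagrams of MADAM.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

def count_anagrams(counts):
    """Anagram Rule: n! / (n_1! ... n_k!) for the given letter multiplicities."""
    counts = list(counts)
    return factorial(sum(counts)) // prod(factorial(c) for c in counts)

word = "MADAMIMADAM"
letter_counts = Counter(word)  # M: 4, A: 4, D: 2, I: 1

part_a = count_anagrams(letter_counts.values())
# (b) Treat MMMM as one meta-letter alongside A^4, D^2, I^1.
part_b = count_anagrams([4, 2, 1, 1])
# (c) Choose the first half (an anagram of MADAM); the rest is forced.
halves = {"".join(p) for p in permutations("MADAM")}
palindromes = {h + "I" + h[::-1] for h in halves}

assert all(sorted(p) == sorted(word) and p == p[::-1] for p in palindromes)
print(part_a, part_b, len(palindromes))  # 34650 840 30
```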




1.6 Counting Rules for Set Operations 


In set theory, one studies various operations for combining sets such as unions, intersections, 
set differences, and Cartesian products. We now examine some counting rules giving the 
sizes of the sets produced by these operations. 

We begin with Cartesian products. Recall that $S \times T$ is the set of all ordered pairs
(x, y) with $x \in S$ and $y \in T$. More generally, the Cartesian product of k sets $S_1, S_2, \ldots, S_k$,
denoted $S_1 \times S_2 \times \cdots \times S_k$, is the set of all ordered k-tuples $(a_1, a_2, \ldots, a_k)$ such that $a_i \in S_i$
for $1 \le i \le k$. We write $S^k$ for the Cartesian product of k copies of the set S.


1.38. The Cartesian Product Rule. Given k finite sets $S_1, \ldots, S_k$, $|S_1 \times S_2 \times \cdots \times S_k| =
|S_1| \cdot |S_2| \cdots |S_k|$. In particular, $|S^k| = |S|^k$.

Proof. We build a typical object $(a_1, \ldots, a_k) \in S_1 \times \cdots \times S_k$ by making a sequence of k
choices. First pick $a_1$ in $n_1 = |S_1|$ ways. Then pick $a_2$ in $n_2 = |S_2|$ ways. Continue similarly;
for $1 \le i \le k$, pick $a_i$ in $n_i = |S_i|$ ways. By the Product Rule, the product set has size
$n_1 n_2 \cdots n_k$. □


Next we consider unions of sets. Recall that the union $A \cup B$ consists of all objects x
such that $x \in A$ or $x \in B$. The intersection $A \cap B$ consists of all objects x such that $x \in A$
and $x \in B$. More generally, $S_1 \cup S_2 \cup \cdots \cup S_k$ consists of all x such that $x \in S_i$ for some i
between 1 and k; and $S_1 \cap S_2 \cap \cdots \cap S_k$ consists of all x such that $x \in S_i$ for every i between
1 and k. Two sets A and B are disjoint iff $A \cap B = \emptyset$ (the empty set). We say that sets
$S_1, \ldots, S_k$ are pairwise disjoint iff for all $i \ne j$, $S_i \cap S_j = \emptyset$. The next rule is nothing more
than a restatement of the Sum Rule in the language of sets.


1.39. The Disjoint Union Rule. For all pairwise disjoint finite sets $S_1, \ldots, S_k$,
$|S_1 \cup S_2 \cup \cdots \cup S_k| = |S_1| + |S_2| + \cdots + |S_k|$.


For sets A and B, the set difference A − B consists of all x such that $x \in A$ and $x \notin B$.
We do not require that B be a subset of A here.


1.40. The Difference Rule. (a) If T is a subset of a finite set S, $|S - T| = |S| - |T|$.
(b) For all finite sets S and all sets U, $|S - U| = |S| - |S \cap U|$.


Proof. To prove (a), assume $T \subseteq S$. Then S is the union of the two disjoint sets T and
S − T, so $|S| = |T| + |S - T|$ by the Disjoint Union Rule. Subtracting |T| from both sides, we
get (a). To prove (b), take $T = S \cap U$, which is a subset of S. Since S − U = S − T, we can
use (a) to conclude that $|S - U| = |S - T| = |S| - |T| = |S| - |S \cap U|$. □


The next rule gives a formula for the size of the union of two arbitrary (not necessarily 
disjoint) finite sets. 


1.41. The Union Rule for Two Sets. For all finite sets S and T, $|S \cup T| = |S| + |T| -
|S \cap T|$.


Proof. The set $S \cup T$ is the union of the three disjoint sets S − T, $S \cap T$, and T − S. By the
Disjoint Union Rule and the Difference Rule,

$$|S \cup T| = |S - T| + |S \cap T| + |T - S| = |S| - |S \cap T| + |S \cap T| + |T| - |T \cap S|,$$

which simplifies to $|S| + |T| - |S \cap T|$. □


By repeatedly using the Union Rule for Two Sets, we can deduce similar formulas for 
the union of three or more sets. For example, given any finite sets A, B, C, it can be shown 
(Exercise 1-39) that 


$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|. \qquad (1.1)$$


The most general version of the Union Rule is called the Inclusion-Exclusion Formula; we 
discuss it in Chapter 4. 
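The Difference Rule, the Union Rule for Two Sets, and formula (1.1) can all be tried out with Python's built-in set operations. The sets in this sketch are an arbitrary choice of ours: the multiples of 2, 3, and 5 among the integers 1 through 30.

```python
# Arbitrary illustrative sets (our choice): multiples of 2, 3, 5 in 1..30.
A = {n for n in range(1, 31) if n % 2 == 0}
B = {n for n in range(1, 31) if n % 3 == 0}
C = {n for n in range(1, 31) if n % 5 == 0}

# Difference Rule (b): |A - B| = |A| - |A ∩ B|.
assert len(A - B) == len(A) - len(A & B)

# Union Rule for Two Sets.
assert len(A | B) == len(A) + len(B) - len(A & B)

# Formula (1.1) for three sets.
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
assert len(A | B | C) == rhs
print(len(A | B | C))  # 22
```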


1.42. Example. (a) How many five-letter words in the alphabet {A,B,C,D,E,F,G} start
with a vowel or end with a consonant? (b) How many five-letter words contain the letters
C and F?

For (a), let S be the set of five-letter words starting with a vowel, and let T be the set
of five-letter words ending with a consonant. We seek $|S \cup T|$. By the Word Rule and the
Product Rule, $|S| = 2 \cdot 7^4$, $|T| = 7^4 \cdot 5$, and $|S \cap T| = 2 \cdot 7^3 \cdot 5$. By the Union Rule for Two
Sets, $|S \cup T| = |S| + |T| - |S \cap T| = 13377$.

To solve (b), we use some negative logic. Let X be the set of all five-letter words; let S be
the set of words in X that do not contain C, and let T be the set of words in X that do not
contain F. Each word in $S \cup T$ has no C or has no F; so each word in $Y = X - (S \cup T)$ has at
least one C and at least one F. To find |Y|, we use the Union Rule and the Difference Rule.
By the Word Rule, $|X| = 7^5$ and $|S| = 6^5 = |T|$. Words in $S \cap T$ are five-letter words using
the alphabet {A,B,D,E,G}, hence $|S \cap T| = 5^5$. Now, $|S \cup T| = |S| + |T| - |S \cap T| = 2 \cdot 6^5 - 5^5$,
so $|Y| = |X| - |S \cup T| = 7^5 - 2 \cdot 6^5 + 5^5 = 4380$.
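Since there are only $7^5 = 16807$ five-letter words over this alphabet, both parts of Example 1.42 can be checked by building the sets explicitly. The names `no_C` and `no_F` in this sketch are ours; they play the roles of S and T in part (b).

```python
from itertools import product

ALPHABET = "ABCDEFG"
VOWELS = set("AE")

X = {"".join(w) for w in product(ALPHABET, repeat=5)}

# Part (a): starts with a vowel or ends with a consonant.
S = {w for w in X if w[0] in VOWELS}
T = {w for w in X if w[-1] not in VOWELS}
assert len(S | T) == len(S) + len(T) - len(S & T)  # Union Rule for Two Sets

# Part (b): remove the words missing C or missing F.
no_C = {w for w in X if "C" not in w}
no_F = {w for w in X if "F" not in w}
Y = X - (no_C | no_F)
assert all("C" in w and "F" in w for w in Y)

print(len(S | T), len(Y))  # 13377 4380
```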
1.7 Probability


The counting techniques discussed so far can be applied to solve many problems in
probability theory. This section introduces some fundamental concepts of probability and
considers several examples.


1.43. Definition: Sample Spaces and Events. A sample space is a set S whose members 
represent the possible outcomes of a random experiment. In this section, we only consider 
finite sample spaces. An event is a subset of the sample space. 


Intuitively, an event consists of the set of outcomes of the random experiment that
possess a particular property we are interested in. We imagine repeating the experiment
many times. Each run of the experiment produces a single outcome $x \in S$; we say that an
event $A \subseteq S$ has occurred on this run of the experiment iff $x$ is in the subset $A$.


1.44. Example: Coin Tossing. Suppose the experiment consists of tossing a coin five
times. We could take the sample space for this experiment to be $S = \{H,T\}^5$, the set of all
5-letter words using the letters H (for heads) and T (for tails). The element $HHHTH \in S$
represents the outcome where the fourth toss was tails and all other tosses were heads. The
subset $A = \{w \in S : w_2 = H\}$ is the event in which the second toss comes up heads. The
subset $B = \{w \in S : w_1 \neq w_5\}$ is the event that the first toss is different from the last toss.
The subset

$$C = \{w \in S : w_i = T \text{ for an odd number of indices } i\}$$

is the event that there are an odd number of tails.


1.45. Example: Dice Rolling. Suppose the experiment consists of rolling a six-sided die
three times. The sample space for this experiment is $S = \{1,2,3,4,5,6\}^3$, the set of all
3-letter words over the alphabet $\{1,2,\ldots,6\}$. The subset $A = \{w \in S : w_1 + w_2 + w_3 \in \{7,11\}\}$
is the event that the sum of the three numbers rolled is 7 or 11. The subset
$B = \{w \in S : w_1 = w_2 = w_3\}$ is the event that all three numbers rolled are the same. The
subset $C = \{w \in S : w \neq (4,1,3)\}$ is the event that we do not see the numbers 4, 1, 3 (in
that order) in the dice rolls.


1.46. Example: Lotteries. Consider the following random experiment. We put 49 white
balls (numbered 1 through 49) into a machine that mixes the balls and then outputs a
sequence of six distinct balls, one at a time. We could take the sample space here to be
the set $S'$ of all 6-letter words $w$ consisting of six distinct letters from $A = \{1,2,\ldots,49\}$.
In lotteries, the order in which the balls are drawn usually does not matter, so it is more
common to take the sample space to be the set $S$ of all 6-element subsets of $A$. (We will
see later that using $S$ instead of $S'$ does not affect the probabilities we are interested in.)
Suppose a lottery player picks a (fixed and known) 6-element subset $T_0$ of $A$. For $0 \le k \le 6$,
define events $B_k = \{T \in S : |T \cap T_0| = k\} \subseteq S$. Intuitively, the event $B_k$ is the set of
outcomes in which the player has matched exactly $k$ of the winning lottery numbers.


1.47. Example: Special Events. For any sample space $S$, $\emptyset$ and $S$ are events. The event
$\emptyset$ contains no outcomes, and therefore never occurs. On the other hand, the event $S$ contains
all the outcomes, and therefore always occurs. If $A$ and $B$ are events (i.e., subsets of $S$),
note that $A \cup B$, $A \cap B$, $S - A$, and $A - B$ are also events. Intuitively, $A \cup B$ is the event that
either $A$ occurs or $B$ occurs (or both); $A \cap B$ is the event that both $A$ and $B$ occur; $S - A$
is the event that $A$ does not occur; and $A - B$ is the event that $A$ occurs but $B$ does not
occur.


Now we can formally define the concept of probability. Intuitively, for each event A, we 
want to define a number P(A) that measures the probability or likelihood that A occurs. 
Numbers close to 1 represent more likely events, while numbers close to 0 represent less 
likely events. A probability-zero event is (virtually) impossible, while a probability-one 
event is (virtually) certain to occur. In general, if we perform the random experiment many 
times, P(A) should approximately equal the ratio of the number of trials where A occurs to 
the total number of trials performed. These intuitive ideas are formalized in the following 
mathematical axioms for probability. 


1.48. Definition: Probability. Assume $S$ is a finite sample space. A probability measure
for $S$ is a function $P$ assigning to each event $A \subseteq S$ a real number $P(A) \in [0,1]$ such that
$P(\emptyset) = 0$, $P(S) = 1$, and for any two disjoint events $A$ and $B$, $P(A \cup B) = P(A) + P(B)$.


By induction, it follows that $P$ satisfies the finite additivity property

$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$$

for all pairwise disjoint sets $A_1, A_2, \ldots, A_n \subseteq S$. Moreover, it can be shown (Exercise 1-51)
that for all events $A$ and $B$, $P(A - B) = P(A) - P(A \cap B)$ and
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.


1.49. Example: Classical Probability Spaces. Suppose $S$ is a finite sample space in
which all outcomes are equally likely. Then we must have $P(\{x\}) = 1/|S|$ for each outcome
$x \in S$. For any event $A \subseteq S$, finite additivity gives

$$P(A) = \frac{|A|}{|S|} = \frac{\text{number of outcomes in } A}{\text{total number of outcomes}}. \tag{1.2}$$

Thus the calculation of probabilities (in this classical setup) reduces to two counting problems:
counting the number of elements in $A$ and counting the number of elements in $S$. We
can take Equation (1.2) as the definition of our probability measure $P$. Note that the axiom
“$A \cap B = \emptyset$ implies $P(A \cup B) = P(A) + P(B)$” is then a consequence of the Sum Rule. Also
note that this probability model will only be appropriate if all the possible outcomes of the
underlying random experiment are equally likely to occur.


1.50. Example: Coin Tossing. Suppose we toss a fair coin five times. The sample space
is $S = \{H,T\}^5$, so that $|S| = 2^5 = 32$. Consider the event $A = \{w \in S : w_2 = H\}$
of getting a head on the second toss. By the Product Rule, $|A| = 2 \cdot 1 \cdot 2^3 = 16$, so
$P(A) = 16/32 = 1/2$. Consider the event $B = \{w \in S : w_1 \neq w_5\}$ in which the first toss
differs from the last toss. $B$ is the disjoint union of $B_1 = \{w \in S : w_1 = H, w_5 = T\}$ and
$B_2 = \{w \in S : w_1 = T, w_5 = H\}$. The Product Rule shows that $|B_1| = |B_2| = 2^3 = 8$, so
that $P(B) = (8 + 8)/32 = 1/2$. Finally, consider the event

$$C = \{w \in S : w_i = T \text{ for an odd number of indices } i\}.$$

$C$ is the disjoint union $C_1 \cup C_3 \cup C_5$, where (for $0 \le k \le 5$) $C_k$ is the event of getting exactly
$k$ tails. Observe that $C_k = R(T^k H^{5-k})$. So $|C_k| = \binom{5}{k,\,5-k} = \binom{5}{k}$ by the Anagram Rule,
$P(C_k) = \binom{5}{k}/2^5$, and

$$P(C) = \frac{\binom{5}{1} + \binom{5}{3} + \binom{5}{5}}{2^5} = 16/32 = 1/2.$$

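
In a classical probability space, Equation (1.2) turns every probability into a ratio of two counts, so small examples can be checked by direct enumeration. The sketch below is an illustration only; the helper names `classical_probability` and `coin_space` are assumptions introduced here, not notation from the text.

```python
from fractions import Fraction
from itertools import product

def classical_probability(sample_space, event):
    """Return |A| / |S| where A = {x in S : event(x)}, as in Equation (1.2)."""
    outcomes = list(sample_space)
    favorable = sum(1 for x in outcomes if event(x))
    return Fraction(favorable, len(outcomes))

def coin_space():
    """All 5-letter words over {H, T}; positions are 0-indexed in the code."""
    return ("".join(w) for w in product("HT", repeat=5))

p_a = classical_probability(coin_space(), lambda w: w[1] == "H")            # second toss heads
p_b = classical_probability(coin_space(), lambda w: w[0] != w[4])           # first differs from last
p_c = classical_probability(coin_space(), lambda w: w.count("T") % 2 == 1)  # odd number of tails

print(p_a, p_b, p_c)  # 1/2 1/2 1/2
```

Using `Fraction` keeps the answers exact, so they can be compared directly with the hand computations.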

1.51. Example: Dice Rolling. Consider the experiment of rolling a six-sided die twice.
The sample space is $S = \{1,2,3,4,5,6\}^2$, so that $|S| = 6^2 = 36$. Consider the event
$A = \{x \in S : x_1 + x_2 \in \{7,11\}\}$ of rolling a sum of 7 or 11. By direct enumeration, we have

$$A = \{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1), (5,6), (6,5)\}; \quad |A| = 8.$$

Therefore, $P(A) = 8/36 = 2/9$. Consider the event $B = \{x \in S : x_1 \neq x_2\}$ of getting
two different numbers on the two rolls. The Product Rule gives $|B| = 6 \cdot 5 = 30$, so
$P(B) = 30/36 = 5/6$.


1.52. Example: Balls in Urns. Suppose an urn contains $n_1$ red balls, $n_2$ white balls, and
$n_3$ blue balls. Let the random experiment consist of randomly drawing a $k$-element subset
of balls from the urn. What is the probability of drawing $k_1$ red balls, $k_2$ white balls, and
$k_3$ blue balls, where $k_1 + k_2 + k_3 = k$? We can take the sample space $S$ to be all $k$-element
subsets of the set

$$\{1, 2, \ldots, n_1, n_1+1, \ldots, n_1+n_2, n_1+n_2+1, \ldots, n_1+n_2+n_3\}.$$

Here the first $n_1$ integers represent red balls, the next $n_2$ integers represent white balls,
and the last $n_3$ integers represent blue balls. By the Subset Rule, $|S| = \binom{n_1+n_2+n_3}{k}$. Let
$A$ be the event where we draw $k_1$ red balls, $k_2$ white balls, and $k_3$ blue balls. To build a
set $T \in A$, we choose a $k_1$-element subset of $\{1,2,\ldots,n_1\}$, then a $k_2$-element subset of
$\{n_1+1,\ldots,n_1+n_2\}$, then a $k_3$-element subset of $\{n_1+n_2+1,\ldots,n_1+n_2+n_3\}$. By
the Subset Rule and Product Rule, $|A| = \binom{n_1}{k_1}\binom{n_2}{k_2}\binom{n_3}{k_3}$. Therefore, the definition of the
probability measure gives

$$P(A) = \binom{n_1}{k_1}\binom{n_2}{k_2}\binom{n_3}{k_3} \bigg/ \binom{n_1+n_2+n_3}{k_1+k_2+k_3}.$$

This calculation can be generalized to the case where the urn has balls of more than three
colors.
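
The urn formula of Example 1.52 translates directly into code and extends to any number of colors. The following is a hedged sketch; the function name `urn_probability` and the sample counts (5 red, 4 white, 3 blue) are made up here for illustration.

```python
from fractions import Fraction
from math import comb, prod

def urn_probability(counts, draws):
    """Probability of drawing draws[i] balls of color i from an urn holding
    counts[i] balls of color i, when a subset of sum(draws) balls is chosen
    uniformly at random."""
    total = comb(sum(counts), sum(draws))                          # Subset Rule: |S|
    favorable = prod(comb(n, k) for n, k in zip(counts, draws))    # Product Rule: |A|
    return Fraction(favorable, total)

# Hypothetical urn: 5 red, 4 white, 3 blue; draw 2 red, 1 white, 1 blue.
print(urn_probability([5, 4, 3], [2, 1, 1]))  # 8/33
```

Because `math.comb(n, k)` returns 0 when `k > n`, impossible draws automatically get probability 0.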


1.53. Example: General Probability Measures on a Finite Sample Space. We can
extend the previous discussion to the case where not all outcomes of the random experiment
are equally likely. Let $S$ be a finite sample space and let $p$ be a function assigning to each
outcome $x \in S$ a real number $p(x) \in [0,1]$, such that $\sum_{x \in S} p(x) = 1$. Intuitively, $p(x)$ is
the probability that the outcome $x$ occurs. Now $p$ is not a probability measure, since its
domain is $S$ (the set of outcomes) instead of $\mathcal{P}(S)$ (the set of events). We build a probability
measure from $p$ by defining $P(A) = \sum_{x \in A} p(x)$ for each event $A \subseteq S$. The axioms for a
probability measure may be routinely verified.


1.54. Remark. In this section, we used counting techniques to solve basic probability 
questions. It is also possible to use probabilistic arguments to help solve counting problems. 
Examples of such arguments appear in §12.4 and §12.12. 

1.8 Lotteries and Card Games


In this section, we give more examples of probability calculations by analyzing lotteries, 
bridge, and five-card poker. 


1.55. Example: Lotteries. Consider the lottery described in Example 1.46. Here the
sample space $S$ consists of all 6-element subsets of $A = \{1,2,\ldots,49\}$, so $|S| = \binom{49}{6} =
13{,}983{,}816$. Suppose a lottery player picks a (fixed and known) 6-element subset $T_0$ of $A$.
For $0 \le k \le 6$, define events $B_k = \{T \in S : |T \cap T_0| = k\}$. $B_k$ occurs when the player
matches exactly $k$ of the winning numbers. We can build a typical object $T \in B_k$ by choosing
$k$ elements of $T_0$ in $\binom{6}{k}$ ways, and then choosing $6 - k$ elements of $A - T_0$ in $\binom{43}{6-k}$ ways.
Thus,

$$P(B_k) = \binom{6}{k}\binom{43}{6-k} \bigg/ \binom{49}{6}.$$

We compute $P(B_3) \approx 0.01765$, $P(B_4) \approx 0.001$, $P(B_5) \approx 1.8 \times 10^{-5}$, and $P(B_6) = 1/|S| \approx
7.15 \times 10^{-8}$. We can view this example as the special case of Example 1.52 where the urn
contains 6 balls of one color (representing the winning numbers) and 43 balls of another
color (representing the other numbers).
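
The lottery probabilities of Example 1.55 are easy to tabulate numerically. This is a small sketch under the example's assumptions (49 balls, 6 drawn); the function name `match_probability` and its keyword parameters are introduced here for illustration.

```python
from fractions import Fraction
from math import comb

def match_probability(k, pool=49, picks=6):
    """P(B_k): exactly k of the player's numbers are among the winning ones."""
    favorable = comb(picks, k) * comb(pool - picks, picks - k)
    return Fraction(favorable, comb(pool, picks))

for k in range(7):
    print(k, float(match_probability(k)))

# The events B_0, ..., B_6 partition the sample space, so this sum is exactly 1.
print(sum(match_probability(k) for k in range(7)))
```

The final sum is a quick consistency check: since every outcome matches some number $k$ of the player's picks, the seven probabilities must add up to 1.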


In the lottery example, we could have taken the sample space to be the set $S'$ of all
ordered sequences of six distinct elements of $\{1,2,\ldots,49\}$. Let $B'_k$ be the event that the
player guesses exactly $k$ numbers correctly (disregarding order, as before). Let $P'$ be the
probability measure on the sample space $S'$. It can be checked that $|S'| = \binom{49}{6} \cdot 6!$ and
$|B'_k| = \binom{6}{k}\binom{43}{6-k} \cdot 6!$, so that

$$P'(B'_k) = \frac{\binom{6}{k}\binom{43}{6-k} \cdot 6!}{\binom{49}{6} \cdot 6!} = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}} = P(B_k).$$

This confirms our earlier remark that the two sample spaces $S$ and $S'$ give the same
probabilities for events that do not depend on the order in which the balls are drawn.


1.56. Example: Powerball. A powerball lottery has two kinds of balls: white balls (numbered
$1, \ldots, M$) and red balls (numbered $1, \ldots, R$). Each week, one red ball and a set of $n$
distinct white balls are randomly chosen. Lottery players separately guess the numbers of
the $n$ white balls and the red ball, which is called the powerball. Players win prizes based
on how many balls they guess correctly. Players always win a prize for matching the red
ball, even if they incorrectly guess all the white balls.

To analyze this lottery, let the sample space be

$$S = \{(T,x) : T \text{ is an } n\text{-element subset of } \{1,2,\ldots,M\} \text{ and } x \in \{1,2,\ldots,R\}\}.$$

Let $(T_0, x_0)$ be a fixed and known element of $S$ representing a given player's lottery ticket.
For $0 \le k \le n$, let $A_k$ be the event $\{(T,x) \in S : |T \cap T_0| = k, x \neq x_0\}$ in which the player
matches exactly $k$ white balls but misses the powerball. Let $B_k$ be the event $\{(T,x) \in S :
|T \cap T_0| = k, x = x_0\}$ in which the player matches exactly $k$ white balls and also matches
the powerball. We have $|S| = \binom{M}{n} R$ by the Subset Rule and the Product Rule. To build
a typical element in $A_k$, we first choose $k$ elements of $T_0$, then choose $n - k$ elements of
$\{1,2,\ldots,M\} - T_0$, then choose $x \in \{1,2,\ldots,R\} - \{x_0\}$. Thus, $|A_k| = \binom{n}{k}\binom{M-n}{n-k}(R-1)$, so

$$P(A_k) = \frac{\binom{n}{k}\binom{M-n}{n-k}(R-1)}{\binom{M}{n} R}.$$

TABLE 1.1
Analysis of Powerball.

Matches          Probability                           Prize Value
0 white, 1 red   0.0261 or 1 in 38
1 white, 1 red   0.0109 or 1 in 92
2 white, 1 red   0.00143 or 1 in 701
3 white, 0 red   0.00172 or 1 in 580
3 white, 1 red   0.000069 or 1 in 14,494
4 white, 0 red   0.0000274 or 1 in 36,525
4 white, 1 red   0.0000011 or 1 in 913,129
5 white, 0 red   8.56 x 10^-8 or 1 in 11.7 million    $1,000,000
5 white, 1 red   3.422 x 10^-9 or 1 in 292 million    Jackpot

Note: The cost of a ticket is $2.


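Similarly, $|B_k| = \binom{n}{k}\binom{M-n}{n-k}$ (the powerball $x = x_0$ can be chosen in only one way), so

$$P(B_k) = \frac{\binom{n}{k}\binom{M-n}{n-k}}{\binom{M}{n} R}.$$

In one version of this lottery, we have $M = 69$, $R = 26$, and $n = 5$. The probabilities of
certain events $A_k$ and $B_k$ are shown in Table 1.1 together with the associated prize amounts.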
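
The probability column of Table 1.1 can be regenerated from the counting formulas for $|A_k|$, $|B_k|$, and $|S|$. The sketch below assumes the parameters $M = 69$, $R = 26$, $n = 5$ quoted above; the function name `powerball_probability` and the flag `red_match` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import comb

def powerball_probability(k, red_match, M=69, R=26, n=5):
    """Probability of matching exactly k white balls, and matching
    (red_match=True, event B_k) or missing (red_match=False, event A_k)
    the powerball."""
    white_ways = comb(n, k) * comb(M - n, n - k)
    red_ways = 1 if red_match else R - 1
    return Fraction(white_ways * red_ways, comb(M, n) * R)

rows = [(0, True), (1, True), (2, True), (3, False), (3, True),
        (4, False), (4, True), (5, False), (5, True)]
for k, red in rows:
    p = powerball_probability(k, red)
    print(f"{k} white, {int(red)} red: {float(p):.3g} (about 1 in {round(1 / p):,})")
```

Working with exact fractions and converting to floating point only for display avoids rounding drift in the very small jackpot probabilities.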


1.57. Example: Bridge. A bridge game has four players, called North, South, East, and 
West. All cards in the deck are dealt to the players so that each player receives a set of 13 
cards. (a) How big is the sample space S? (b) Find the probability that North receives all 
four aces and South is void in clubs (i.e., has no clubs). 

For (a), we can build an object in $S$ by choosing a set of 13 cards out of 52 for North
($\binom{52}{13}$ ways), then choosing a set of 13 out of the remaining 39 cards for South ($\binom{39}{13}$ ways),
then choosing a set of 13 of the remaining 26 cards for East ($\binom{26}{13}$ ways), then choosing
13 of the remaining 13 cards for West ($\binom{13}{13} = 1$ way). The resulting product of binomial
coefficients simplifies to the multinomial coefficient $\binom{52}{13,13,13,13}$. This is not a coincidence:
we can also think of an element of $S$ as an anagram $w \in R(N^{13}S^{13}E^{13}W^{13})$, as follows.
Let $C_1 C_2 \cdots C_{52}$ be a fixed ordering of the 52 cards. Use the anagram $w = w_1 w_2 \cdots w_{52}$ to
distribute cards to the bridge players by giving card $C_i$ to North, South, East, or West
when $w_i$ is N, S, E, or W, respectively. This encoding of bridge hands via anagrams is an
instance of the Bijection Rule, discussed in §1.11 below.

Let $B$ be the event described in (b). We build an outcome in $B$ as follows. First give North
the four aces (one way). There are now 48 cards left, 12 clubs and 36 non-clubs. Choose a
set of 13 of the 36 non-clubs for South ($\binom{36}{13}$ ways). Now give North 9 more of the remaining
35 cards ($\binom{35}{9}$ ways), then choose 13 of the remaining 26 cards for East ($\binom{26}{13}$ ways), then
give the remaining 13 cards to West (1 way). By the Product Rule, $|B| = \binom{36}{13}\binom{35}{9}\binom{26}{13}$.
Then

$$P(B) = \frac{\binom{36}{13}\binom{35}{9}}{\binom{52}{13}\binom{39}{13}} \approx 0.000032.$$

Next we compute more probabilities associated with five-card poker.


1.58. Example: Four-of-a-Kind Hands. A four-of-a-kind poker hand is a five-card
hand such that some value in $\{2,3,\ldots,10,J,Q,K,A\}$ appears four times in the hand. For
example, {3♥, 3♣, 5♣, 3♦, 3♠} is a four-of-a-kind hand. To find the probability of such a
hand, we use the sample space $S$ consisting of all five-element subsets of the 52-card deck.
We know $|S| = \binom{52}{5}$ by the Subset Rule. Define the event $A$ to be the subset of $S$ consisting
of all four-of-a-kind hands. We build a typical hand in $A$ by choosing the value to occur
four times (there are $n_1 = 13$ possibilities), then putting all four cards of that value in the
hand (in $n_2 = \binom{4}{4} = 1$ way), then selecting a fifth card ($n_3 = 48$ choices). By the Product
Rule, $|A| = 13 \cdot 1 \cdot 48 = 624$. So $P(A) = |A|/|S| \approx 0.00024$.


1.59. Example: One-Pair Hands. A one-pair poker hand is a five-card hand containing
four values, one of which occurs twice. For example, {5♥, 9♦, 9♣, J♥, A♠} is a one-pair
hand. Let $B$ be the event consisting of all one-pair hands. Build an object in $B$ via these
choices: first choose the value that appears twice ($n_1 = 13$ ways); then choose a subset of
two cards out of the four cards with this value ($n_2 = \binom{4}{2} = 6$ ways); then choose a subset
of three values from the twelve unused values ($n_3 = \binom{12}{3} = 220$ ways); then choose a suit
for the lowest of these three values ($n_4 = 4$ ways); then choose a suit for the next lowest
value ($n_5 = 4$ ways); then choose a suit for the last value ($n_6 = 4$ ways). The Product Rule
gives $|B| = 1{,}098{,}240$, so $P(B) = |B|/|S| \approx 0.4226$. For instance, the sample hand above
is built from the following choice sequence: choose value 9, then suits {♦, ♣}, then values
{5, J, A}, then suit ♥ for the 5, then suit ♥ for the jack, then suit ♠ for the ace. (Compare
to Exercise 1-27.)
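
The Product Rule counts in Examples 1.58 and 1.59 can be evaluated mechanically, one factor per choice in the construction. This is only an arithmetic sketch; the variable names are chosen here for readability.

```python
from math import comb

total_hands = comb(52, 5)  # Subset Rule: all five-card hands

# Four of a kind: value, all four suits of that value, then a fifth card.
four_of_a_kind = 13 * comb(4, 4) * 48

# One pair: paired value, two of its four cards, three other values,
# then one suit for each of those three values.
one_pair = 13 * comb(4, 2) * comb(12, 3) * 4 * 4 * 4

print(four_of_a_kind, round(four_of_a_kind / total_hands, 5))  # 624 0.00024
print(one_pair, round(one_pair / total_hands, 4))              # 1098240 0.4226
```

Writing each factor as its own `comb` call keeps the code in one-to-one correspondence with the choice sequence, which makes a miscounted stage easier to spot.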


1.60. Example: Discarding Cards. Suppose we are dealt the five-card poker hand $H$ =
{3♥, 4♣, 5♣, 9♥, K♣} from a 52-card deck. We now have the opportunity to discard $k$ cards
from our hand and receive $k$ new cards. (a) If we discard the 9 and the king, what is the
probability we will be dealt a straight hand? (b) If we discard the 3 and 9, what is the
probability we will be dealt a flush?

In both parts, we can take the sample space to be the set $S$ of all two-element subsets
of the 47-element set consisting of the deck with the cards in $H$ removed. Thus, $|S| =
\binom{47}{2} = 1081$. For (a), we can build a straight starting from the partial hand {3♥, 4♣, 5♣}
as follows: choose the low value for the straight (there are three possibilities: ace, two, or
three); choose the suit for the lowest value not in the original hand (four ways); choose the
suit for the remaining value (four ways). The number of hands is $3 \cdot 4 \cdot 4 = 48$, and the
probability is $48/1081 \approx 0.0444$.

For (b), we can build a flush starting from the partial hand {4♣, 5♣, K♣} by choosing
a subset of two of the remaining ten club cards in $\binom{10}{2} = 45$ ways. The probability here is
$45/1081 \approx 0.0416$, just slightly less than the probability in (a).
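
Since the sample space in Example 1.60 has only 1081 outcomes, both counts can be confirmed by brute force. The sketch below rests on stated assumptions: cards are encoded as (value, suit) pairs with ace = 14, the ace may also count low in a straight (as the example's three possible low values require), and the 3 and 9 are taken to be hearts. Only the fact that these two cards are not clubs affects the counts. All function names are hypothetical.

```python
from itertools import combinations

VALUES = range(2, 15)  # 11 = J, 12 = Q, 13 = K, 14 = A
SUITS = "CDHS"
DECK = [(v, s) for v in VALUES for s in SUITS]

def is_straight(hand):
    """Five consecutive values; the ace may be high or low."""
    values = sorted(v for v, _ in hand)
    if values == [2, 3, 4, 5, 14]:  # A-2-3-4-5
        return True
    return all(values[i + 1] - values[i] == 1 for i in range(4))

def is_flush(hand):
    return len({s for _, s in hand}) == 1

def count_completions(kept, dealt_hand, predicate):
    """Return (number of good draws, number of all draws) when the kept cards
    are completed to five cards from the deck minus the originally dealt hand."""
    remaining = [c for c in DECK if c not in dealt_hand]
    draws = list(combinations(remaining, 5 - len(kept)))
    good = sum(1 for d in draws if predicate(kept + list(d)))
    return good, len(draws)

H = [(3, "H"), (4, "C"), (5, "C"), (9, "H"), (13, "C")]  # suits of 3 and 9 assumed
print(count_completions([H[0], H[1], H[2]], H, is_straight))  # (48, 1081)
print(count_completions([H[1], H[2], H[4]], H, is_flush))     # (45, 1081)
```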

1.9 Conditional Probability and Independence


Suppose that, in a certain random experiment, we are told that a particular event has 
occurred. Given this additional information, we can recompute the probability of other 
events occurring. This leads to the notion of conditional probability. 


1.61. Definition: Conditional Probability. Suppose $A$ and $B$ are events in some sample
space $S$ such that $P(B) > 0$. The conditional probability of $A$ given $B$, denoted $P(A|B)$, is
defined by setting $P(A|B) = P(A \cap B)/P(B)$. In the case where $S$ is a finite set of equally
likely outcomes, we have $P(A|B) = |A \cap B|/|B|$.


To motivate this definition, suppose we know for certain that event $B$ has occurred on
some run of the experiment. Given this knowledge, another event $A$ also occurred iff $A \cap B$
has occurred. This explains why $P(A \cap B)$ appears in the numerator. We divide by $P(B)$
to normalize the conditional probabilities, so that (for instance) $P(B|B) = 1$. The next
example shows that the conditional probability of $A$ given some other event can be greater
than, less than, or equal to the original unconditional probability $P(A)$.


1.62. Example: Dice Rolling. Consider the experiment of rolling a fair die twice. What
is the probability of getting a sum of 7 or 11, given that the second roll comes up 5? Here,
the sample space is $S = \{1,2,3,4,5,6\}^2$. Let $A$ be the event of getting a sum of 7 or
11, and let $B$ be the event that the second die shows 5. We have $P(B) = 1/6$, and we
saw earlier that $P(A) = 2/9$. Listing outcomes, we see that $A \cap B = \{(2,5), (6,5)\}$, so
$P(A \cap B) = 2/36 = 1/18$. Therefore, the required conditional probability is

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/18}{1/6} = 1/3 > 2/9 = P(A).$$

On the other hand, let $C$ be the event that the second roll comes up 4. Here $A \cap C = \{(3,4)\}$,
so

$$P(A|C) = \frac{P(A \cap C)}{P(C)} = \frac{1/36}{1/6} = 1/6 < 2/9 = P(A).$$

Next, let $D$ be the event that the first roll is odd. Then $A \cap D = \{(1,6), (3,4), (5,2), (5,6)\}$,
so

$$P(A|D) = \frac{P(A \cap D)}{P(D)} = \frac{4/36}{1/2} = 2/9 = P(A).$$
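
The three conditional probabilities in Example 1.62 can be recomputed by enumerating the 36 outcomes and applying Definition 1.61 directly. This is an illustrative sketch; the helpers `prob` and `cond_prob` and the predicate names are introduced here and are not part of the text.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # all ordered pairs of rolls

def prob(event):
    """Classical probability |event| / |S| for a predicate on outcomes."""
    return Fraction(sum(1 for x in S if event(x)), len(S))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B); assumes P(B) > 0."""
    return prob(lambda x: a(x) and b(x)) / prob(b)

def sum_7_or_11(x):
    return x[0] + x[1] in (7, 11)

def second_is_5(x):
    return x[1] == 5

def second_is_4(x):
    return x[1] == 4

def first_is_odd(x):
    return x[0] % 2 == 1

print(prob(sum_7_or_11))                     # 2/9
print(cond_prob(sum_7_or_11, second_is_5))   # 1/3
print(cond_prob(sum_7_or_11, second_is_4))   # 1/6
print(cond_prob(sum_7_or_11, first_is_odd))  # 2/9
```

The three conditioning events illustrate the three possible behaviors: the conditional probability is larger than, smaller than, and equal to $P(A)$, respectively.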


1.63. Example: Balls in Urns. Suppose an urn contains $r$ red balls and $b$ blue balls,
where $r, b \ge 2$. Consider an experiment in which two balls are drawn from the urn in
succession, without replacement. What is the probability that the first ball is red, given
that the second ball is blue? We take the sample space to be the set $S$ of all words $w_1 w_2$,
where $w_1 \neq w_2$ and

$$w_1, w_2 \in \{1, 2, \ldots, r, r+1, \ldots, r+b\}.$$

Here, the numbers 1 through $r$ represent red balls, and the numbers $r+1$ through $r+b$
represent blue balls. The event of drawing a red ball first is the subset

$$A = \{w_1 w_2 : 1 \le w_1 \le r\}.$$

The event of drawing a blue ball second is the subset

$$B = \{w_1 w_2 : r+1 \le w_2 \le r+b\}.$$

By the Product Rule, $|S| = (r+b)(r+b-1)$, $|A| = r(r+b-1)$, $|B| = b(r+b-1)$, and
$|A \cap B| = rb$. The conditional probability of $A$ given $B$ is

$$P(A|B) = P(A \cap B)/P(B) = r/(r+b-1).$$

In contrast, the unconditional probability of $A$ is

$$P(A) = |A|/|S| = r/(r+b).$$

The conditional probability is slightly higher than the unconditional probability; intuitively,
we are more likely to have gotten a red ball first if we know the second ball was not red.
The probability that the second ball is blue, given that the first ball is red, is

$$P(B|A) = P(B \cap A)/P(A) = b/(r+b-1).$$

Note that $P(B|A) \neq P(A|B)$ (unless $r = b$).
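
The formulas of Example 1.63 can be checked by listing all ordered draws of two distinct balls. The sketch below is illustrative only: the function name `urn_conditionals` and the sample values $r = 3$, $b = 5$ are assumptions made here.

```python
from fractions import Fraction
from itertools import permutations

def urn_conditionals(r, b):
    """Return (P(A|B), P(A), P(B|A)) for A = 'first ball red' and
    B = 'second ball blue', drawing two balls without replacement."""
    colors = ["R"] * r + ["B"] * b
    S = list(permutations(range(r + b), 2))  # ordered pairs of distinct balls
    A = [w for w in S if colors[w[0]] == "R"]
    B = [w for w in S if colors[w[1]] == "B"]
    AB = [w for w in A if colors[w[1]] == "B"]
    return (Fraction(len(AB), len(B)),
            Fraction(len(A), len(S)),
            Fraction(len(AB), len(A)))

print(urn_conditionals(3, 5))
# (Fraction(3, 7), Fraction(3, 8), Fraction(5, 7))
```

With $r = 3$ and $b = 5$, the output matches $r/(r+b-1)$, $r/(r+b)$, and $b/(r+b-1)$, and the two conditional probabilities coincide only when $r = b$.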

1.64. Example: Card Hands. What is the probability that a 5-card poker hand is a full
house, given that the hand is void in hearts (i.e., no card in the hand is a heart)? Let $A$
be the event of getting a full house, and let $B$ be the event of being void in hearts. We
have $|B| = \binom{39}{5} = 575{,}757$ since we must choose a five-element subset of the $52 - 13 = 39$
non-heart cards. Next, we must compute $|A \cap B|$. To build a full house hand using no
hearts, make the following choices: first, choose a value to occur three times (13 ways);
second, choose the suits for this value (1 way, as hearts are forbidden); third, choose a
value to occur twice (12 ways); fourth, choose the suits for this value ($\binom{3}{2} = 3$ ways). By
the Product Rule, $|A \cap B| = 13 \cdot 1 \cdot 12 \cdot 3 = 468$. Accordingly, the probability we want is
$P(A|B) = 468/575{,}757 \approx 0.000813$.

Next, what is the probability of getting a full house, given that the hand has at least
two cards of the same value? Let $C$ be the event that at least two cards in the hand have
the same value; we seek $P(A|C) = P(A \cap C)/P(C) = |A \cap C|/|C|$. The numerator here can
be computed quickly: since $A \subseteq C$, we have $A \cap C = A$ and hence $|A \cap C| = |A| = 3744$
(see Example 1.30). To compute the denominator, let us first enumerate $S - C$, where $S$ is
the full sample space of all five-card poker hands. Note that $S - C$ occurs iff all five cards
in the hand have different values. Choose these values ($\binom{13}{5}$ ways), and then choose suits
for each card (4 ways each). By the Product Rule, $|S - C| = \binom{13}{5} 4^5 = 1{,}317{,}888$. By the
Difference Rule, $|C| = |S| - |S - C| = 1{,}281{,}072$. The required conditional probability is
$P(A|C) = |A \cap C|/|C| \approx 0.00292$.
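
The arithmetic in Example 1.64 can be reproduced step by step. This is a plain numerical sketch with variable names chosen here; the full-house count $13 \cdot \binom{4}{3} \cdot 12 \cdot \binom{4}{2}$ is the standard one and agrees with the value 3744 quoted from Example 1.30.

```python
from fractions import Fraction
from math import comb

# Full house given void in hearts.
void_in_hearts = comb(39, 5)                               # |B|
full_house_no_hearts = 13 * comb(3, 3) * 12 * comb(3, 2)   # |A and B|
p_given_void = Fraction(full_house_no_hearts, void_in_hearts)

# Full house given at least two cards of the same value.
all_hands = comb(52, 5)                                    # |S|
all_values_distinct = comb(13, 5) * 4**5                   # |S - C|
some_repeated_value = all_hands - all_values_distinct      # |C|, Difference Rule
full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)            # |A|
p_given_repeat = Fraction(full_houses, some_repeated_value)

print(float(p_given_void), float(p_given_repeat))
```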


In some situations, the knowledge that a particular event $D$ occurs does not change the
probability that another event $A$ will occur. For instance, events $D$ and $A$ in Example 1.62
have this property because $P(A|D) = P(A)$. Writing out the definition of $P(A|D)$ and
multiplying by $P(D)$, we see that the stated property is equivalent to $P(A \cap D) = P(A)P(D)$
(assuming $P(D) > 0$). This suggests the following definition, which is valid even when
$P(D) = 0$.


1.65. Definition: Independence of Two Events. Two events $A$ and $D$ are called
independent iff $P(A \cap D) = P(A)P(D)$.


Unlike the definition of conditional probability, this definition is symmetric in $A$ and
$D$. So, $A$ and $D$ are independent iff $D$ and $A$ are independent. As indicated above, when
$P(D) > 0$, independence of $A$ and $D$ is equivalent to $P(A|D) = P(A)$. Similarly, when
$P(A) > 0$, independence of $A$ and $D$ is equivalent to $P(D|A) = P(D)$. So, when considering
two independent events of positive probability, knowledge that either event has occurred
gives us no new information about the probability of the other event occurring.


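1.66. Definition: Independence of a Collection of Events. Suppose $A_1, \ldots, A_n$ are
events. This list of events is called independent iff for all choices of indices $1 \le i_1 < i_2 <
\cdots < i_k \le n$, we have $P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) P(A_{i_2}) \cdots P(A_{i_k})$.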


1.67. Example. Let $S = \{a,b,c,d\}$, and suppose each outcome in $S$ occurs with probability
1/4. Define events $B = \{a,b\}$, $C = \{a,c\}$, and $D = \{a,d\}$. One verifies immediately that $B$
and $C$ are independent; $B$ and $D$ are independent; and $C$ and $D$ are independent. However,
the list of events $B, C, D$ is not independent, because

$$P(B \cap C \cap D) = P(\{a\}) = 1/4 \neq 1/8 = P(B)P(C)P(D).$$


1.68. Example: Coin Tossing. Suppose we toss a fair coin five times. Take the sample
space to be $S = \{H,T\}^5$. Let $A$ be the event that the first and last toss agree; let $B$ be
the event that the third toss is tails; let $C$ be the event that there are an odd number of
heads. Routine counting arguments show that $|S| = 2^5 = 32$, $|A| = 2^4 = 16$, $|B| = 2^4 = 16$,
$|C| = \binom{5}{1} + \binom{5}{3} + \binom{5}{5} = 16$, $|A \cap B| = 2^3 = 8$, $|A \cap C| = 2\left(\binom{3}{1} + \binom{3}{3}\right) = 8$,
$|B \cap C| = \binom{4}{1} + \binom{4}{3} = 8$, and $|A \cap B \cap C| = 4$. It follows that

$$P(A \cap B) = P(A)P(B); \quad P(A \cap C) = P(A)P(C); \quad P(B \cap C) = P(B)P(C);$$
$$\text{and } P(A \cap B \cap C) = P(A)P(B)P(C).$$

Thus, the list of events $A, B, C$ is independent.
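
Independence of a list of events requires the product formula for every sub-list, not just for pairs; Examples 1.67 and 1.68 show both outcomes. The following sketch checks the condition by brute force, assuming equally likely outcomes. The function names `prob` and `is_independent` and the variable names are illustrative choices made here.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def prob(event, space):
    """Classical probability of an event (a set) in a finite sample space."""
    return Fraction(len(event), len(space))

def is_independent(events, space):
    """Check the product formula for every sub-list with at least two events."""
    for size in range(2, len(events) + 1):
        for sub in combinations(events, size):
            intersection = set.intersection(*sub)
            if prob(intersection, space) != prod(prob(e, space) for e in sub):
                return False
    return True

# Example 1.67: pairwise independent, but not independent as a list.
S = set("abcd")
B, C, D = {"a", "b"}, {"a", "c"}, {"a", "d"}
print(is_independent([B, C], S), is_independent([B, C, D], S))  # True False

# Example 1.68: five tosses of a fair coin.
T = {"".join(w) for w in product("HT", repeat=5)}
first_last_agree = {w for w in T if w[0] == w[4]}
third_is_tails = {w for w in T if w[2] == "T"}
odd_heads = {w for w in T if w.count("H") % 2 == 1}
print(is_independent([first_last_agree, third_is_tails, odd_heads], T))  # True
```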


We often assume that unrelated physical events are independent (in the mathematical 
sense) to help us construct a probability model. The next examples illustrate this process. 


1.69. Example. Urn 1 contains 3 red balls and 7 blue balls. Urn 2 contains 2 red balls 
and 8 blue balls. Urn 3 contains 4 red balls and 1 blue ball. If we randomly choose one ball 
from each urn, what is the probability that all three balls have the same color? 

Let $R$ be the event that all three balls are red, and let $B$ be the event that all three
balls are blue. Then $R = R_1 \cap R_2 \cap R_3$ where $R_i$ is the event that the ball drawn from urn
$i$ is red. Since draws from different urns should not affect one another, it is reasonable to
assume that the list $R_1, R_2, R_3$ is independent. So

$$P(R) = P(R_1 \cap R_2 \cap R_3) = P(R_1)P(R_2)P(R_3) = (3/10) \cdot (2/10) \cdot (4/5) = 0.048.$$

Similarly, $P(B) = (7/10) \cdot (8/10) \cdot (1/5) = 0.112$. The events $R$ and $B$ are disjoint, so the
answer is $P(R \cup B) = P(R) + P(B) = 0.16$.


1.70. Example: Tossing an Unfair Coin. Consider a random experiment in which we
toss an unbalanced coin $n$ times in a row. Assume the coin comes up heads with probability
$q$ and tails with probability $1 - q$, and successive coin tosses are unrelated to one another. Let
the sample space be $S = \{H,T\}^n$. Since the coin is unfair, it is not appropriate to assume
that every point of $S$ occurs with equal probability. Given an outcome $w = w_1 w_2 \cdots w_n \in S$,
what should the probability $p(w)$ be? Consider an example where $n = 5$ and $w = HHTHT$.
Define five events $B_1 = \{z \in S : z_1 = H\}$, $B_2 = \{z \in S : z_2 = H\}$, $B_3 = \{z \in S : z_3 = T\}$,
$B_4 = \{z \in S : z_4 = H\}$, and $B_5 = \{z \in S : z_5 = T\}$. Our physical assumptions
suggest that $B_1, \ldots, B_5$ should be independent events because different tosses of the coin
are unrelated. Moreover, $P(B_1) = P(B_2) = P(B_4) = q$, and $P(B_3) = P(B_5) = 1 - q$. Since
$B_1 \cap B_2 \cap B_3 \cap B_4 \cap B_5 = \{w\}$, the definition of independence leads to

$$p(w) = P(B_1 \cap \cdots \cap B_5) = P(B_1)P(B_2) \cdots P(B_5) = q \cdot q \cdot (1-q) \cdot q \cdot (1-q) = q^3(1-q)^2.$$

Similar reasoning shows that if $w = w_1 w_2 \cdots w_n \in S$ is any outcome consisting of $k$ heads
and $n - k$ tails arranged in one particular specified order, then we should define
$p(w) = q^k(1-q)^{n-k}$.

Next, define $P(A) = \sum_{w \in A} p(w)$ for every event $A \subseteq S$. For example, let $A_k$ be the
event that we get $k$ heads and $n - k$ tails in any order. Note that $|A_k| = |R(H^k T^{n-k})| = \binom{n}{k}$
by the Anagram Rule, and $p(w) = q^k(1-q)^{n-k}$ for each $w \in A_k$. It follows that

$$P(A_k) = \sum_{w \in A_k} p(w) = \binom{n}{k} q^k (1-q)^{n-k}.$$

We have not yet checked that $\sum_{w \in S} p(w) = 1$, which is needed to prove that we have
defined a legitimate probability measure (see Example 1.53). This fact can be deduced
from the Binomial Theorem (discussed in §2.3), as follows. Since $S$ is the disjoint union of
$A_0, A_1, \ldots, A_n$, we have

$$\sum_{w \in S} p(w) = \sum_{k=0}^{n} \sum_{w \in A_k} p(w) = \sum_{k=0}^{n} \binom{n}{k} q^k (1-q)^{n-k}.$$

By the Binomial Theorem 2.8, the right side is $(q + [1-q])^n = 1^n = 1$, as needed.
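
The construction in Example 1.70 can be tested numerically for a particular bias. The sketch below assumes $q = 1/3$ and $n = 5$ purely for illustration; the function name `p` mirrors the text, and everything else is a naming choice made here. Exact fractions are used so that the total comes out to exactly 1.

```python
from fractions import Fraction
from itertools import product
from math import comb

def p(word, q):
    """Probability of one specific outcome: q per head, 1 - q per tail."""
    k = word.count("H")
    return q**k * (1 - q)**(len(word) - k)

q = Fraction(1, 3)  # assumed bias, chosen only for this illustration
n = 5
S = ["".join(w) for w in product("HT", repeat=n)]

# P(A_k) computed by summing p(w) over A_k versus the closed formula.
for k in range(n + 1):
    direct = sum(p(w, q) for w in S if w.count("H") == k)
    formula = comb(n, k) * q**k * (1 - q)**(n - k)
    assert direct == formula

total = sum(p(w, q) for w in S)
print(p("HHTHT", q), total)  # 4/243 1
```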

1.10 Counting Functions


This section counts various kinds of functions. We use the notation $f : X \to Y$ to mean
that $f$ is a function with domain $X$ and codomain $Y$. This means that for each $x$ in the
set $X$, there exists exactly one $y$ in the set $Y$ with $y = f(x)$. We can think of $X$ as the set
of all inputs to $f$, and $Y$ as the set of all potential outputs for $f$. The set of outputs that
actually occur as function values is called the image of $f$, denoted $f[X] = \{f(x) : x \in X\}$.

When $X$ and $Y$ are finite sets, we can visualize $f$ by an arrow diagram. We obtain this
diagram by drawing a dot for each element of $X$ and $Y$, and drawing an arrow from $x$ to
$y$ whenever $y = f(x)$. The definition of a function requires that each input $x \in X$ have
exactly one arrow emanating from it, and the arrow must point to an element of $Y$. On the
other hand, a potential output $y \in Y$ may have zero, one, or more than one arrow hitting
it. Figure 1.1 displays the arrow diagrams for four functions $f$, $g$, $h$, and $p$.


FIGURE 1.1 
Arrow diagrams for four functions f, g, h, and p. 


1.71. The Function Rule. Given a $k$-element domain $X$ and an $n$-element codomain $Y$,
there are $n^k$ functions $f : X \to Y$.


Proof. Let $X = \{x_1, x_2, \ldots, x_k\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$. To build a typical function
$f : X \to Y$, we choose the function values one at a time. First, choose $f(x_1)$ to be any of
the elements in $Y$; this choice can be made in $n_1 = n$ ways. Second, choose $f(x_2)$ to be any
of the elements in $Y$; again there are $n_2 = n$ ways. Continue similarly; at the $k$th stage, we
choose $f(x_k)$ to be any element of $Y$ in $n_k = n$ ways. The Product Rule shows that the
number of functions we can build is $n_1 n_2 \cdots n_k = n \cdot n \cdot \ldots \cdot n = n^k$. $\square$
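
The proof of the Function Rule is constructive: a function is just a list of $k$ independently chosen values. The sketch below mirrors that construction, representing each function as a dictionary; the name `all_functions` and the small sample sets are assumptions made here for illustration.

```python
from itertools import product

def all_functions(X, Y):
    """Yield every function X -> Y as a dict, choosing f(x) freely for each x."""
    X = list(X)
    for values in product(Y, repeat=len(X)):
        yield dict(zip(X, values))

funcs = list(all_functions("abc", [1, 2]))
print(len(funcs))  # 8, which is 2**3
print(funcs[0])    # {'a': 1, 'b': 1, 'c': 1}
```

Here `itertools.product` plays the role of the Product Rule: it makes one unrestricted choice from `Y` for each of the `k` inputs.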


Next we study three special types of functions: injections, surjections, and bijections. 


1.72. Definition: Injections. A function $f : X \to Y$ is an injection iff for all $u, v \in X$,
$u \neq v$ implies $f(u) \neq f(v)$. Injective functions are also called one-to-one functions.


This definition says that different inputs must always map to different outputs when we
apply an injective function. In the arrow diagram for an injective function, every $y \in Y$ has
at most one arrow entering it. In Figure 1.1, $g$ and $p$ are one-to-one functions, but $f$ and $h$
are not.


1.73. The Injection Rule. Given a $k$-element domain $X$ and an $n$-element codomain $Y$
with $k \le n$, there are $n!/(n-k)!$ injections $f : X \to Y$. If $k > n$, there are no one-to-one
functions from $X$ to $Y$.


Proof. First assume $k \le n$. As above, we construct a typical injection $f$ from $X =
\{x_1, \ldots, x_k\}$ into $Y$ by choosing the $k$ function values $f(x_i)$, for $1 \le i \le k$. At the $i$th
stage, we choose $f(x_i)$ to be an element of $Y$ distinct from the elements $f(x_1), \ldots, f(x_{i-1})$
already chosen. Since the latter elements are pairwise distinct, we see that there are
$n - (i-1) = n - i + 1$ alternatives for $f(x_i)$, no matter what happened in the first $i - 1$ choices.
By the Product Rule, the number of injections is $n(n-1) \cdots (n-k+1) = n!/(n-k)!$.
On the other hand, suppose $k > n$. Try to build an injection $f$ by choosing the values
$f(x_1), f(x_2), \ldots$ as before. When we try to choose $f(x_{n+1})$, there are no elements of $Y$
distinct from the previously chosen elements $f(x_1), \ldots, f(x_n)$. Since it is impossible to
complete the construction of $f$, there are no injections from $X$ to $Y$ in this case. $\square$
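
An injection from a $k$-element set into an $n$-element set is the same thing as a list of $k$ distinct values, so the Injection Rule can be compared against a direct enumeration. This is a minimal sketch; `count_injections` and `injection_rule` are names invented here.

```python
from itertools import permutations
from math import factorial

def count_injections(k, n):
    """Count injections by listing value tuples (f(x_1), ..., f(x_k))
    whose entries are distinct elements of an n-element codomain."""
    return sum(1 for _ in permutations(range(n), k))

def injection_rule(k, n):
    """The closed formula n!/(n-k)! for k <= n, and 0 for k > n."""
    return factorial(n) // factorial(n - k) if k <= n else 0

print(count_injections(3, 5), injection_rule(3, 5))  # 60 60
print(count_injections(4, 3), injection_rule(4, 3))  # 0 0
```

When $k > n$, `itertools.permutations` yields nothing, which matches the second half of the proof: the construction cannot be completed.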


1.74. Definition: Surjections. A function $f : X \to Y$ is a surjection iff for every $y \in Y$
there exists $x \in X$ with $y = f(x)$. Surjective functions are also said to map $X$ onto $Y$; by
an abuse of grammar, one sometimes says “$f : X \to Y$ is an onto function.”


In the arrow diagram for a surjective function, every $y \in Y$ has at least one arrow
entering it. For example, the functions $h$ and $p$ in Figure 1.1 are surjective, but $f$ and $g$ are
not. To count surjections in general, we need techniques not yet introduced; see §2.13. For
now, we look at one special case that can be solved with rules already available. See also
Exercise 1-65.


1.75. Example: Counting Surjections. How many functions $f : \{1,2,3,4,5,6\} \to
\{a,b,c,d,e\}$ are surjective? Note that such a function is completely determined by the
list of function values $(f(1), f(2), \ldots, f(6))$. We require that all elements of the codomain
appear in this list, so exactly one such element must appear twice. Build the list by choosing
which letter appears twice ($n_1 = 5$ ways); then choosing a subset of two positions out of six
for that letter ($n_2 = \binom{6}{2} = 15$ ways); then filling the remaining positions from left to right
with a permutation of the remaining letters ($n_3 = 4! = 24$ ways). The Product Rule gives
$5 \cdot 15 \cdot 24 = 1800$ surjections. For instance, if we choose d, then {2,4}, then ebca, we build the
surjection given by f(1) = e, f(2) = d, f(3) = b, f(4) = d, f(5) = c, and f(6) = a. More generally,
the same argument shows that there are $n\binom{n+1}{2}(n-1)! = (n+1)!\,n/2$ surjections mapping
an (n+1)-element domain onto an n-element codomain.
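
As a sanity check on this example, one can enumerate all $5^6 = 15625$ functions and count the surjective ones. This is an illustrative sketch only; `count_surjections` is a name introduced here, and the codomain is modeled as {0, ..., n-1} rather than letters.

```python
from itertools import product
from math import comb, factorial


def count_surjections(k, n):
    """Count surjections from a k-element set onto an n-element set by brute force."""
    count = 0
    for values in product(range(n), repeat=k):
        if len(set(values)) == n:  # every element of the codomain is hit
            count += 1
    return count


def special_case_formula(n):
    """Formula from the example for an (n+1)-element domain and n-element codomain."""
    return n * comb(n + 1, 2) * factorial(n - 1)
```

With these definitions, `count_surjections(6, 5)` agrees with `special_case_formula(5)`, which is 1800.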


1.76. Definition: Bijections. A function $f : X \to Y$ is a bijection iff f is both injective
and surjective. This means that for every $y \in Y$, there exists a unique $x \in X$ with $y = f(x)$.
Bijective functions are also called one-to-one correspondences from X onto Y.


In the arrow diagram for a bijective function, every $y \in Y$ has exactly one arrow entering
it. For example, the function p in Figure 1.1 is a bijection, but the other three functions are 
not. The following theorem will help us count bijections. 


1.77. Theorem. Let X and Y be finite sets with the same number of elements. For all
functions $f : X \to Y$, the following conditions are equivalent: f is injective; f is surjective;
f is bijective.


Proof. Suppose X and Y both have n elements, and write $X = \{x_1,\ldots,x_n\}$. First assume
that $f : X \to Y$ is injective. Then the image $f[X] = \{f(x_1),\ldots,f(x_n)\}$ is a subset of Y
consisting of n distinct elements. Since Y has n elements, this subset must be all of Y. This
means that every $y \in Y$ has the form $f(x_i)$ for some $x_i \in X$, so that f is surjective.

Next assume $f : X \to Y$ is surjective. We prove that f is a bijection by contradiction.
If f were not a bijection, then f must not be one-to-one, which means there exist i, j with
$i \ne j$ and $f(x_i) = f(x_j)$. It follows that the set $f[X] = \{f(x_1),\ldots,f(x_n)\}$ contains fewer
than n elements, since the displayed list of members of f[X] contains at least one duplicate.
Thus f[X] is a proper subset of Y. Letting y be any element of $Y - f[X]$, we see that y does
not have the form f(x) for any $x \in X$. Therefore f is not surjective, which is a contradiction.

Finally, if f is bijective, then f is injective by definition. □


The previous result does not extend to infinite sets, as shown by the following examples.
Let $\mathbb{Z}_{>0} = \{1,2,3,\ldots\}$ be the set of positive integers. The function $f : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ defined
by $f(n) = n+1$ for $n \in \mathbb{Z}_{>0}$ is injective but not surjective. The function $g : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$
defined by $g(2k) = g(2k-1) = k$ for all $k \in \mathbb{Z}_{>0}$ is surjective but not injective. The function
$\exp : \mathbb{R} \to \mathbb{R}$ defined by $\exp(x) = e^x$ is injective but not surjective. The function $h : \mathbb{R} \to \mathbb{R}$
defined by $h(x) = x(x-1)(x+1)$ is surjective but not injective.


1.78. The Bijection-Counting Rule. Suppose X is an n-element set and Y is an m-element
set. If $n = m$, then there are $n!$ bijections $f : X \to Y$. If $n \ne m$, then there are no
bijections from X to Y.


Proof. Suppose X has n elements and Y has m elements. If n = m, then bijections from 
X to Y are the same as injections from X to Y by the previous theorem, so the number of 
bijections is n! by the Injection Rule. 

If $n < m$, then the image of any $f : X \to Y$ has size at most n. So the image of f cannot
be all of Y, which means there are no surjections (and hence no bijections) from X to Y. If
$n > m$, then there must be a repeated value in the list $f(x_1), f(x_2), \ldots, f(x_n)$, since each
entry in this list comes from the m-element set Y. This means there are no injections (and
hence no bijections) from X to Y. □


1.79. Remark. Compare the Word Rule to the Function Rule, the Partial Permutation
Rule to the Injection Rule, and the Permutation Rule to the Bijection-Counting Rule. You
will notice that the same formulas appear in each pair of rules. This is not a coincidence.
Indeed, we can formally define a word $w_1 w_2 \cdots w_k$ over an n-element alphabet A as the
function $w : \{1,2,\ldots,k\} \to A$ defined by $w(i) = w_i$. The number of such words (functions)
is $n^k$. The word $w_1 w_2 \cdots w_k$ is a partial permutation of A iff the $w_i$'s are all distinct iff
w is an injective function. The word $w_1 w_2 \cdots w_n$ is a permutation of A iff w is a bijective
function. Finally, note that w is surjective iff every letter in the alphabet A occurs among
the letters $w_1, \ldots, w_k$.


1.11 Cardinality and the Bijection Rule 


So far we have been using the notation $|S| = n$ informally as an abbreviation for the
statement “the set S has n elements.” The number $|S|$ is called the cardinality of S. We
can give a more formal definition of this notation in terms of bijections.


1.80. Definition: Cardinality. For all sets S and all integers $n \in \mathbb{Z}_{\ge 0}$, $|S| = n$ means
there exists a bijection $f : S \to \{1,2,\ldots,n\}$. For all sets A and B (possibly infinite),
$|A| = |B|$ means there exists a bijection $f : A \to B$.


If $f : X \to Y$ and $g : Y \to Z$ are functions, the composition of g and f is the function
$g \circ f : X \to Z$ defined by $(g \circ f)(x) = g(f(x))$ for $x \in X$. We state the following theorem
without proof.


1.81. Theorem: Properties of Bijections. Let X, Y, Z be any sets. (a) The identity
map $\mathrm{id}_X : X \to X$, defined by $\mathrm{id}_X(x) = x$ for all $x \in X$, is a bijection. Hence, $|X| = |X|$.
(b) A function $f : X \to Y$ is bijective iff there exists a function $g : Y \to X$ such that
$g \circ f = \mathrm{id}_X$ and $f \circ g = \mathrm{id}_Y$. If such a g exists, it is unique; we call it the two-sided inverse of
f and denote it by $f^{-1}$. This inverse is also a bijection, and $(f^{-1})^{-1} = f$. Hence, $|X| = |Y|$
implies $|Y| = |X|$. (c) The composition of two bijections is a bijection. Hence, if $|X| = |Y|$
and $|Y| = |Z|$ then $|X| = |Z|$.


It can also be shown that for all sets S and all $n, m \in \mathbb{Z}_{\ge 0}$, if $|S| = n$ and $|S| = m$
then $n = m$, so the cardinality of a finite set is well-defined. This fact is equivalent to
the assertion that there are no bijections from an n-element set to an m-element set when
$m \ne n$. We gave an intuitive explanation of this statement in our proof of the
Bijection-Counting Rule, but to give a rigorous formal proof, a delicate induction argument is needed
(we omit this). In any case, this fact justifies the following rule, which is the foundation of
bijective combinatorics.


1.82. The Bijection Rule. If B is an n-element set and there is a bijection $f : A \to B$ or
a bijection $g : B \to A$, then $|A| = n$.


Here are some initial examples to illustrate the Bijection Rule. 


1.83. Example: Subsets. Let A be the set of all subsets of $\{1,2,\ldots,n\}$. We apply the
Bijection Rule to show that $|A| = 2^n$ (we obtained the same result by a different method
in the proof of the Power Set Rule 1.26). To use the Bijection Rule, we need another set B
whose size is already known to be $2^n$. We take B to be $\{0,1\}^n$, the set of all n-letter words
in the alphabet {0,1}, which has size $2^n$ by the Word Rule. Now we must define a bijection
$f : A \to B$. An object S in the domain of f is a subset of $\{1,2,\ldots,n\}$. We define $f(S) =
w_1 w_2 \cdots w_n$, where $w_i = 1$ if $i \in S$, and $w_i = 0$ if $i \notin S$. For example, taking n = 5, we have
f({1,3,4}) = 10110, f({3,5}) = 00101, $f(\emptyset)$ = 00000, and f({1,2,3,4,5}) = 11111.

To see that f is a bijection, we produce a two-sided inverse $g : B \to A$ for f. Given a
word $w = w_1 w_2 \cdots w_n \in B$, define $g(w) = \{i : w_i = 1\}$, which is the subset of positions
in the word that contain a 1. For example, taking n = 5, we have g(01110) = {2,3,4},
g(00010) = {4}, and g(10111) = {1,3,4,5}. One sees immediately that $g(f(S)) = S$ for all
$S \in A$, and $f(g(w)) = w$ for all $w \in B$, so g is the inverse of f. Thus f and g are bijections,
so $|A| = |B| = 2^n$ by the Bijection Rule.
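
The two maps in this example translate directly into code. The sketch below is illustrative; the names `subset_to_word` and `word_to_subset` are introduced here and play the roles of f and g, with subsets modeled as Python sets and words as strings of 0s and 1s.

```python
def subset_to_word(S, n):
    """The map f: send a subset S of {1, ..., n} to its n-letter indicator word."""
    return "".join("1" if i in S else "0" for i in range(1, n + 1))


def word_to_subset(w):
    """The map g: send a word in {0,1}^n to the set of positions holding a 1."""
    return {i for i, letter in enumerate(w, start=1) if letter == "1"}
```

For example, `subset_to_word({1, 3, 4}, 5)` returns `"10110"` and `word_to_subset("10111")` returns `{1, 3, 4, 5}`, matching the values computed above.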


1.84. Example: Increasing and Decreasing Words. A word $w = w_1 w_2 \cdots w_k$ in the
alphabet $\{1,2,\ldots,n\}$ is strictly increasing iff $w_1 < w_2 < \cdots < w_k$; the word w is weakly
increasing iff $w_1 \le w_2 \le \cdots \le w_k$. Strictly decreasing and weakly decreasing words are
defined similarly. How many k-letter words in the alphabet $X = \{1,2,\ldots,n\}$ are: (a) strictly
increasing; (b) strictly decreasing; (c) weakly increasing?

Trying to build words by the Product Rule does not work here because the number of
choices for a given position depends on what value was chosen for the preceding position.
Instead, we use the Bijection Rule. Let A be the set of strictly increasing k-letter words in
the alphabet X, and let B be the set of k-element subsets of X. We know $|B| = \binom{n}{k}$ by the
Subset Rule. Define $f : A \to B$ by setting $f(w_1 w_2 \cdots w_k) = \{w_1, w_2, \ldots, w_k\}$. For example,
when k = 4 and n = 9, f(2358) = {2,3,5,8}. The inverse of f is the map $g : B \to A$
defined by letting g(S) be the list of the k elements in the subset S written in increasing
order. For example, g({4,9,7,1}) = 1479 and g({8,2,3,4}) = 2348. We have $g(f(w)) = w$
for all $w \in A$. Since presenting the elements of a set in a different order does not change
the set, we also have $f(g(S)) = S$ for all $S \in B$. For example, f(g({3,7,5,1})) = f(1357) =
{1,3,5,7} = {3,7,5,1}. Thus g really is a two-sided inverse for f, proving that f and g are
bijections. We conclude that $|A| = |B| = \binom{n}{k}$.

To solve (b), let A' be the set of strictly decreasing k-letter words in the alphabet X.
The reversal map $r : A \to A'$ given by $r(w_1 w_2 w_3 \cdots w_k) = w_k \cdots w_3 w_2 w_1$ is a one-to-one
map from A onto A'. By the Bijection Rule, $|A'| = |A| = \binom{n}{k}$.

To solve (c), we need a more subtle bijection. Let C be the set of weakly increasing
k-letter words in the alphabet X. The map f from part (a) can no longer be used, since
we cannot guarantee that the output will always be a subset of size k, and collisions may
occur. For example, the sequence 1133 would map to the subset {1,1,3,3} = {1,3}, and
the sequence 1333 also maps to {1,3}. We get around this problem by the following clever
device. Let D be the set of all k-letter strictly increasing words in the expanded alphabet
$\{1,2,\ldots,n,n+1,\ldots,n+k-1\}$. By part (a) with n replaced by $n+k-1$, we know
$|D| = \binom{n+k-1}{k}$. Define a map $h : C \to D$ by setting $h(w_1 w_2 \cdots w_k) = z_1 z_2 \cdots z_k$, where
$z_i = w_i + i - 1$ for $1 \le i \le k$. In other words, h acts by adding 0 to $w_1$, 1 to $w_2$, 2 to $w_3$,
and so on. For example, taking k = 4 and n = 6, h(1133) = 1256, h(1333) = 1456, and
h(2356) = 2479. Since the input $w \in C$ satisfies $1 \le w_1 \le w_2 \le w_3 \le \cdots \le w_k \le n$, we see
that $1 \le w_1 + 0 < w_2 + 1 < w_3 + 2 < \cdots < w_k + k - 1 \le n + k - 1$. This shows that the
output word h(w) really does belong to the codomain D. To see that h is a bijection, we
produce the inverse map $h' : D \to C$. Given $u = u_1 u_2 \cdots u_k \in D$, define $h'(u) = v_1 v_2 \cdots v_k$,
where $v_i = u_i - (i-1)$ for $1 \le i \le k$. For example, h'(2458) = 2335, h'(1379) = 1256,
and h'(6789) = 6666. Since u satisfies $1 \le u_1 < u_2 < \cdots < u_k \le n+k-1$, it follows that
v satisfies $1 \le v_1 \le v_2 \le \cdots \le v_k \le n$, so h' really does map D into C. It is immediate
that $h'(h(w)) = w$ for all $w \in C$ and $h(h'(u)) = u$ for all $u \in D$, so h is a bijection with
inverse h'. By the Bijection Rule, $|C| = |D| = \binom{n+k-1}{k}$. Applying a bijection that reverses
words, we can also conclude that there are $\binom{n+k-1}{k}$ weakly decreasing k-letter words in the
alphabet X.
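
The shifting bijection from part (c) is easy to experiment with. The following sketch is illustrative only; words are modeled as tuples of integers, and the names `h`, `h_inverse`, and `weakly_increasing_words` are chosen here to mirror the maps h and h' and the set C.

```python
from itertools import combinations, combinations_with_replacement
from math import comb


def h(w):
    """Add 0, 1, 2, ... to successive letters of a weakly increasing word."""
    return tuple(letter + i for i, letter in enumerate(w))


def h_inverse(u):
    """Subtract 0, 1, 2, ... from successive letters of a strictly increasing word."""
    return tuple(letter - i for i, letter in enumerate(u))


def weakly_increasing_words(n, k):
    """All weakly increasing k-letter words in the alphabet {1, ..., n}."""
    return list(combinations_with_replacement(range(1, n + 1), k))
```

For k = 4 and n = 6, the image of `weakly_increasing_words(6, 4)` under `h` is exactly the set of strictly increasing 4-letter words in {1, ..., 9}, and both sets have $\binom{9}{4} = 126$ elements.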


1.85. Remark: Proving Bijectivity. When using the Bijection Rule, we must check that
a given formula or algorithm really does define a bijective function f from a set X to a set
Y. This can be done in two ways. One way is to check that: (a) for each $x \in X$, there is
exactly one associated output f(x) (so that f is well-defined or single-valued); (b) for each
$x \in X$, f(x) does lie in the set Y (so that f maps into Y); (c) for all $u, v \in X$, if $f(u) = f(v)$
then $u = v$ (so that f is injective); and (d) for all $y \in Y$, there exists $x \in X$ with $y = f(x)$
(so that f is surjective).

The second way is to produce a two-sided inverse g for f. In this case, we must check that
f is well-defined and maps into Y, that g is well-defined and maps into X, that $g(f(x)) = x$
for all $x \in X$, and that $f(g(y)) = y$ for all $y \in Y$. In simple situations, we usually check all
these conditions by inspection without writing a detailed proof. However, for bijective proofs
involving complex combinatorial objects, it is necessary to give these details. In particular,
in part (c) of the example above, the most crucial point to check was that h and h' both
mapped into their claimed codomains.


1.12 Counting Multisets and Compositions


Recall that the concepts of order and repetition play no role when deciding whether two 
sets are equal. For instance, {1,3,5} = {3,5, 1} = {1,1,1,5,5, 3,3} since all these sets have 
the same members. We now introduce the concept of a multiset, in which order still does 
not matter, but repetitions of a given element are significant. 


1.86. Definition: Multisets. A multiset is a pair M = (S, m), where S is a set and
$m : S \to \mathbb{Z}_{>0}$ is a function. For $x \in S$, the number m(x) is called the multiplicity of x in M
or the number of copies of x in M. The number of elements of M is $|M| = \sum_{x \in S} m(x)$.


We often display a multiset as a list $[x_1, x_2, \ldots, x_k]$, where each $x \in S$ occurs exactly
m(x) times in the list. However, we must remember that the order of the elements in
this list does not matter when deciding equality of multisets. For example, [1,1,2,3,3] $\ne$
[1,1,1,2,3,3] = [3,2,3,1,1,1].


1.87. Notational Convention. We use square brackets for multisets (repetition matters,
order does not). We use curly braces for sets (order and repetition do not matter). We use
round parentheses or word notation for sequences (order and repetition do matter). For
example, [w,x,x,z,z,z] is a multiset, {w,x,z} is a set, (x,z,w,x,w) is a sequence, and
xzwxw is a word.


1.88. The Multiset Rule. The number of k-element multisets using letters from an n-letter
alphabet is

$$\binom{k+n-1}{k,\,n-1} = \frac{(k+n-1)!}{k!\,(n-1)!}.$$


We give two proofs of this result. First Proof: Let $X = (x_1, x_2, \ldots, x_n)$ be a fixed
(ordered) n-letter alphabet, and let U be the set of all k-element multisets using letters
from X. Introduce the symbols * (star) and | (bar), and let $V = R(*^k\,|^{n-1})$ be the set of
all rearrangements of k stars and $n-1$ bars. By the Anagram Rule, $|V| = \binom{k+n-1}{k,\,n-1}$. It
therefore suffices to define a bijection $f : U \to V$.

Given a multiset $M = (S, m) \in U$ with $S \subseteq X$, extend m to a function defined on all
of X by letting $m(x_i) = 0$ if $x_i \notin S$. Then define f(M) to be the word consisting of $m(x_1)$
stars, then a bar, then $m(x_2)$ stars, then a bar, and so on; we end with $m(x_n)$ stars not
followed by a bar. Using the notation $*^j$ to denote a sequence of j stars, we can write

$$f(M) = *^{m(x_1)} \,|\, *^{m(x_2)} \,|\, \cdots \,|\, *^{m(x_{n-1})} \,|\, *^{m(x_n)}.$$


Since M is a multiset of size k, the word f(M) contains k stars. Since there is no bar after
the stars for $x_n$, the word f(M) contains $n-1$ bars. Thus f(M) does belong to the set V.
For example, if X = (w,x,y,z) and k = 3, we have

f([w,w,w]) = ***|||,  f([w,x,y]) = *|*|*|,  f([x,x,x]) = |***||,  f([x,z,z]) = |*||**.


To see that f is a bijection, we define a map $f' : V \to U$ by letting $f'(*^{m_1} | *^{m_2} | \cdots | *^{m_n})$
be the unique multiset that has $m_i$ copies of $x_i$ for $1 \le i \le n$ (note each $m_i \ge 0$). Since
$\sum_{i=1}^{n} m_i = k$, this is a k-element multiset using letters from X. For example, if n = 6,
k = 4, and X = (1,2,3,4,5,6), then f'(||*||*|**) = [3,5,6,6]. It is routine to check that
f' is the two-sided inverse of f.
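
The stars-and-bars encoding in the first proof can be sketched in a few lines. This is an illustration under stated assumptions: a multiset is modeled as a Python list of letters, the ordered alphabet as a string or list, and the names `multiset_to_word` and `word_to_multiset` (standing in for f and f') are introduced here.

```python
def multiset_to_word(M, alphabet):
    """The map f: write m(x) stars for each letter x in order, separated by bars."""
    return "|".join("*" * M.count(x) for x in alphabet)


def word_to_multiset(word, alphabet):
    """The map f': the i-th run of stars gives the multiplicity of the i-th letter."""
    runs = word.split("|")  # n - 1 bars produce exactly n (possibly empty) runs
    return [x for x, run in zip(alphabet, runs) for _ in run]
```

The inverse returns the multiset as a list sorted by alphabet order, so `word_to_multiset("||*||*|**", [1, 2, 3, 4, 5, 6])` gives `[3, 5, 6, 6]`.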

Second Proof: By replacing $x_i$ by i, we may assume without loss of generality that the
alphabet X is $(1,2,\ldots,n)$. As above, let U be the set of all k-element multisets using
letters from X. Let W be the set of weakly increasing sequences of elements of X having
length k. We found $|W| = \binom{k+n-1}{k}$ in Example 1.84(c). By the Bijection Rule, it suffices
to define a bijection $g : W \to U$. Given $w = (w_1 \le w_2 \le \cdots \le w_k)$ in W, let g(w) be the
multiset $[w_1, w_2, \ldots, w_k]$ in U. For example, taking n = 6 and k = 4, g(1135) = [1,1,3,5].
The inverse of g is the map $g' : U \to W$ sending a multiset $M \in U$ to the sequence of
elements of M (repeated according to their multiplicity) written in weakly increasing order.
For example, g'([4,2,2,4]) = 2244 and g'([5,1,6,1]) = 1156. We have $g(g'(M)) = M$ for all
$M \in U$ since order does not matter in a multiset. Similarly, $g'(g(w)) = w$ for all $w \in W$, so
g is a bijection with inverse g'.

We can also view the Multiset Rule as counting the number of solutions to a certain
linear equation using variables ranging over $\mathbb{Z}_{\ge 0} = \{0,1,2,3,\ldots\}$.


1.89. The Integer Equation Rule. Let n > 0, k > 0 be fixed integers. (a) The number
of sequences $(z_1, z_2, \ldots, z_n)$ with all $z_i \in \mathbb{Z}_{\ge 0}$ and $z_1 + z_2 + \cdots + z_n = k$ is $\binom{k+n-1}{k,\,n-1}$. (b) The
number of sequences $(y_1, y_2, \ldots, y_n)$ with all $y_i \in \mathbb{Z}_{>0}$ and $y_1 + y_2 + \cdots + y_n = k$ is $\binom{k-1}{n-1}$.

Proof. (a) A particular solution $(z_1, z_2, \ldots, z_n)$ to the given equation corresponds to the
k-element multiset M in the alphabet $\{1,2,\ldots,n\}$ where i occurs $z_i$ times for $1 \le i \le n$.
This correspondence is evidently a bijection, so part (a) follows from the Bijection Rule and
the Multiset Rule.

(b) Note that $y_1 + y_2 + \cdots + y_n = k$ holds for given $y_i \in \mathbb{Z}_{>0}$ iff $(y_1 - 1) + (y_2 - 1) +
\cdots + (y_n - 1) = k - n$ holds with each $y_i - 1 \in \mathbb{Z}_{\ge 0}$. Setting $z_1 = y_1 - 1, \ldots, z_n = y_n - 1$, we
are reduced to counting solutions to $z_1 + \cdots + z_n = k - n$ where all $z_i \in \mathbb{Z}_{\ge 0}$. By (a) with
k replaced by $k - n$, the number of solutions to this equation is $\binom{k-n+n-1}{n-1} = \binom{k-1}{n-1}$. □
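
Both parts of the Integer Equation Rule can be confirmed numerically for small n and k. The sketch below is a brute-force check, not an efficient algorithm; `count_solutions` and its parameter `lowest` (the smallest value each variable may take) are names introduced here.

```python
from itertools import product
from math import comb


def count_solutions(n, k, lowest):
    """Count sequences of n integers, each at least `lowest`, that sum to k."""
    return sum(
        1
        for values in product(range(lowest, k + 1), repeat=n)
        if sum(values) == k
    )
```

For example, with n = 3 and k = 5 there are `count_solutions(3, 5, 0)` = $\binom{7}{2}$ = 21 nonnegative solutions and `count_solutions(3, 5, 1)` = $\binom{4}{2}$ = 6 positive solutions.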


A variation of the last problem is to count positive integer solutions to $y_1 + y_2 + \cdots + y_n =
k$ where n is not fixed in advance. In this context, the following terminology is used.


1.90. Definition: Compositions. A composition of an integer k > 0 is a sequence $\alpha =
(\alpha_1, \alpha_2, \ldots, \alpha_s)$ where each $\alpha_i$ is a positive integer and $\alpha_1 + \alpha_2 + \cdots + \alpha_s = k$. The number
of parts of $\alpha$ is s. Let Comp(k) be the set of all compositions of k.


1.91. Example. The sequences (1,3, 1,3,3) and (3,3,3,1,1) are two distinct compositions 
of 11 with five parts. The four compositions of 3 are (3), (2,1), (1,2), and (1,1, 1). 


1.92. The Composition Rule. (a) For all k > 0, there are $2^{k-1}$ compositions of k. (b)
For all k, s > 0, there are $\binom{k-1}{s-1}$ compositions of k with s parts.


Proof. For (a), we define a bijection $g : \mathrm{Comp}(k) \to \{0,1\}^{k-1}$. Given $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_s) \in
\mathrm{Comp}(k)$, define

$$g(\alpha) = 0^{\alpha_1 - 1}\, 1\, 0^{\alpha_2 - 1}\, 1 \cdots 1\, 0^{\alpha_s - 1}.$$

Here, the notation $0^j$ denotes a sequence of j consecutive zeroes, and $0^0$ denotes the empty
word. For example, g((3,1,3)) = 001100. Since $\sum_{i=1}^{s} (\alpha_i - 1) = k - s$ and there are $s - 1$
ones, we see that $g(\alpha) \in \{0,1\}^{k-1}$. Now define $g' : \{0,1\}^{k-1} \to \mathrm{Comp}(k)$ as follows.
We can uniquely write any word $w \in \{0,1\}^{k-1}$ in the form $w = 0^{b_1} 1\, 0^{b_2} 1 \cdots 1\, 0^{b_s}$ where
$s \ge 1$, each $b_i \ge 0$, and $\sum_{i=1}^{s} b_i = (k-1) - (s-1) = k - s$ since there are $s - 1$
ones. Define $g'(w) = (b_1 + 1, b_2 + 1, \ldots, b_s + 1)$, which is a composition of k. For example,
g'(100100) = (1,3,3). It can be checked that g' is the two-sided inverse of g, so g is a
bijection. By the Bijection Rule, $|\mathrm{Comp}(k)| = |\{0,1\}^{k-1}| = 2^{k-1}$.

Part (b) merely restates part (b) of the Integer Equation Rule. Alternatively, if we
restrict the domain of g to the set of compositions of k with a given number s of parts,
we get a bijection from this set onto the set of words $R(0^{k-s} 1^{s-1})$, so part (b) also follows
from the Anagram Rule. □


The bijections in the preceding proof are best understood pictorially. We represent an
integer i > 0 as a sequence of i unit squares glued together. We visualize a composition
$(\alpha_1, \ldots, \alpha_s)$ by drawing the squares for $\alpha_1, \ldots, \alpha_s$ in a single row, separated by gaps. For
instance, the composition (1,3,1,3,3) is represented by the picture


□ □□□ □ □□□ □□□


We now scan the picture from left to right and record what happens between each two
successive boxes. If the two boxes in question are glued together, we record a 0; if there is
a gap between the two boxes, we record a 1. The composition of 11 pictured above maps
to the word $1001100100 \in \{0,1\}^{10}$. Going the other way, the word $0101000011 \in \{0,1\}^{10}$
leads first to the picture


□□ □□ □□□□□ □ □


and then to the composition (2,2,5,1,1). It can be checked that the pictorial operations
just described correspond precisely to the maps g and g' in the proof above. When k = 3,
we have:

g((3)) = 00;  g((2,1)) = 01;  g((1,2)) = 10;  g((1,1,1)) = 11.
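
The maps g and g' between compositions and binary words can be sketched as follows. This is illustrative code, not from the original text; compositions are modeled as tuples, words as strings, and the names `comp_to_word` and `word_to_comp` are chosen here.

```python
def comp_to_word(alpha):
    """The map g: each part a contributes a - 1 zeroes, and parts are joined by ones."""
    return "1".join("0" * (a - 1) for a in alpha)


def word_to_comp(w):
    """The map g': each maximal run of b zeroes between ones becomes a part b + 1."""
    return tuple(len(run) + 1 for run in w.split("1"))
```

For example, `comp_to_word((1, 3, 1, 3, 3))` returns `"1001100100"` and `word_to_comp("0101000011")` returns `(2, 2, 5, 1, 1)`, in agreement with the pictures above.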


1.13 Counting Balls in Boxes 


Many counting questions can be reduced to the problem of counting the number of ways to
distribute k balls into n boxes. There are several variations of this problem, depending on
whether the balls are labeled or unlabeled, whether the boxes are labeled or unlabeled, and 
whether we place restrictions on the number of balls in each box. The two most common 
restrictions require that each box have at most one ball, or that each box have at least one 
ball. In this section, we study versions of the problem where the boxes are labeled and the 
balls may or may not be labeled. Versions involving unlabeled boxes will be solved later, 
when we study integer partitions and set partitions (see §2.11 through §2.13). 


1.93. Counting Labeled Balls in Labeled Boxes. (a) The number of ways to put k
balls labeled 1,2,...,k into n boxes labeled 1,2,...,n is $n^k$. (b) If each box can contain at
most one ball, the number of distributions is $n!/(n-k)!$ for $k \le n$ and 0 for $k > n$. (c) If each
box must contain at least one ball, the number of distributions is $\sum_{j=0}^{n} (-1)^j \binom{n}{j} (n-j)^k$.


Proof. When the balls and boxes are both labeled, we can encode the distribution of balls
into boxes by a function $f : \{1,2,\ldots,k\} \to \{1,2,\ldots,n\}$, by letting $y = f(x)$ iff the
ball labeled x goes into the box labeled y. So (a) follows from the Function Rule. The
requirement in (b) that each box (potential output y) contains at most one ball translates
into the requirement that the corresponding function be injective. So (b) follows from the
Injection Rule. Similarly, the requirement in (c) that each box be nonempty translates
into the condition that the corresponding function be surjective. The formula in (c) is the
number of surjections from a k-element set onto an n-element set; we will prove this formula
in §4.3. □
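
Although the formula in (c) is proved later, all three counts can be compared against a direct enumeration for small k and n. The sketch below is illustrative; a distribution is modeled as a tuple whose entry in position x is the box holding ball x + 1, and all function names are introduced here.

```python
from itertools import product
from math import comb, perm


def count_distributions(k, n, condition):
    """Count placements of k labeled balls in n labeled boxes satisfying `condition`."""
    return sum(1 for f in product(range(1, n + 1), repeat=k) if condition(f, n))


def any_placement(f, n):
    return True


def at_most_one(f, n):
    return len(set(f)) == len(f)  # no box is used twice


def at_least_one(f, n):
    return len(set(f)) == n  # every box is used


def surjection_formula(k, n):
    """The alternating sum from part (c)."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** k for j in range(n + 1))
```

For k = 4 and n = 3, the enumeration gives $3^4 = 81$ unrestricted placements and 36 placements with no empty box, matching `surjection_formula(4, 3)`.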


1.94. Counting Unlabeled Balls in Labeled Boxes. (a) The number of ways to put
k identical balls into n boxes labeled 1,2,...,n is $\binom{k+n-1}{k,\,n-1}$. (b) If each box can contain at
most one ball, the number of distributions is $\binom{n}{k}$. (c) If each box must contain at least one
ball, the number of distributions is $\binom{k-1}{n-1}$.


Proof. When the boxes are labeled but the balls are not, we can model a distribution of
k balls into the n boxes by a multiset of size k using the alphabet $\{1,2,\ldots,n\}$, where the
multiset contains i copies of j iff the box labeled j contains i balls. So (a) follows from the
Multiset Rule. The requirement in (b) that each box contains at most one ball means that
the corresponding multiset contains at most one copy of each box label. In this case, the
multiset is the same as an ordinary k-element subset of the n box labels. So (b) follows
from the Subset Rule. The problem in (c) is the same as counting the number of solutions
to $y_1 + y_2 + \cdots + y_n = k$, where each $y_i$ is a positive integer giving the number of balls in
box i. So (c) follows from part (b) of the Integer Equation Rule. □


1.95. Example. How many ways can we give twelve identical cookies and four different
baseball cards to seven kids if each kid must receive at least one cookie and at most one
baseball card? We solve this with the Product Rule, first choosing who gets the cookies and
then choosing who gets the baseball cards. The first choice can be modeled as a placement
of 12 unlabeled balls (the cookies) in 7 labeled boxes (the kids) where each box must contain
at least one ball. As seen above, there are $\binom{12-1}{7-1} = \binom{11}{6} = 462$ ways to do this. The second choice
can be modeled as a placement of 4 labeled balls (the baseball cards) in 7 labeled boxes
(the kids) where each box contains at most one ball. There are $7!/(7-4)! = 840$ ways to
do this. So the answer is $462 \cdot 840 = 388{,}080$.
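
The arithmetic in this example can be reproduced with the standard library. The variable names below are chosen for illustration only.

```python
from math import comb, perm

# Cookies: 12 unlabeled balls in 7 labeled boxes, each box nonempty.
cookie_ways = comb(12 - 1, 7 - 1)

# Baseball cards: 4 labeled balls in 7 labeled boxes, at most one per box.
card_ways = perm(7, 4)

# Product Rule: the two choices are made independently.
total_ways = cookie_ways * card_ways
```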


1.96. Example. How many anagrams of TETRAGRAMMATON have no consecutive vowels? To
begin the solution, let us first figure out how many valid patterns of consonants and vowels
are possible. We can encode this pattern by a word in $R(C^9 V^5)$ with no two adjacent V's;
for instance, the original word has vowel-consonant pattern CVCCVCCVCCVCVC. Each
such word has the form

$$C^{x_1} V C^{x_2} V C^{x_3} V C^{x_4} V C^{x_5} V C^{x_6},$$

where $x_1 + x_2 + \cdots + x_6 = 9$, $x_1$ and $x_6$ are nonnegative integers, and $x_2, x_3, x_4, x_5$ are
strictly positive integers. Setting $y_1 = x_1$, $y_i = x_i - 1$ for i = 2,3,4,5, and $y_6 = x_6$, the
given equation in the $x_i$'s is equivalent to the equation $y_1 + y_2 + \cdots + y_6 = 5$ with all
$y_i \in \mathbb{Z}_{\ge 0}$. By the Integer Equation Rule, the number of solutions is $\binom{5+6-1}{5,\,6-1} = 252$. We
could have also viewed this as counting the number of ways to put nine unlabeled balls (the
consonants) into six labeled boxes (the spaces before and after the vowels) such that the
four middle boxes must be nonempty.

We finish the problem with the Product Rule and Anagram Rule. Build an anagram
by first choosing a valid vowel-consonant pattern (252 ways); then replacing the C's in the
chosen pattern (from left to right) by an anagram in $R(G^1 M^2 N^1 R^2 T^3)$ in one of $\binom{9}{1,2,1,2,3} =
15{,}120$ ways; then replacing the V's in the chosen pattern (from left to right) by an anagram
in $R(A^3 E^1 O^1)$ in one of $\binom{5}{3,1,1} = 20$ ways. The final answer is
$252 \cdot 15{,}120 \cdot 20 = 76{,}204{,}800$.
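
The three factors in this solution can be recomputed independently. In the sketch below (an illustration, with all names introduced here), the number of valid patterns is found by enumerating the possible sets of vowel positions directly rather than by the Integer Equation Rule, which gives a cross-check on the value 252.

```python
from collections import Counter
from itertools import combinations
from math import factorial

WORD = "TETRAGRAMMATON"
VOWELS = set("AEIOU")


def multinomial(counts):
    """Number of anagrams of a word whose letters occur with the given multiplicities."""
    counts = list(counts)
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result


vowel_counts = Counter(ch for ch in WORD if ch in VOWELS)
consonant_counts = Counter(ch for ch in WORD if ch not in VOWELS)
num_vowels = sum(vowel_counts.values())

# A pattern is a choice of vowel positions with no two positions adjacent.
patterns = sum(
    1
    for positions in combinations(range(len(WORD)), num_vowels)
    if all(b - a > 1 for a, b in zip(positions, positions[1:]))
)

answer = (
    patterns
    * multinomial(consonant_counts.values())
    * multinomial(vowel_counts.values())
)
```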


1.14 Counting Lattice Paths 


In this section, we give more illustrations of the Bijection Rule by counting combinatorial 
objects called lattice paths. 


1.97. Definition: Lattice Paths. A lattice path in the plane is a sequence

$$P = ((x_0, y_0), (x_1, y_1), \ldots, (x_k, y_k)),$$

where the $x_i$'s and $y_i$'s are integers, and for each $i \ge 1$, either $(x_i, y_i) = (x_{i-1} + 1, y_{i-1})$ or
$(x_i, y_i) = (x_{i-1}, y_{i-1} + 1)$. We say that P is a path from $(x_0, y_0)$ to $(x_k, y_k)$.


We often take $(x_0, y_0)$ to be the origin (0,0). The sequence P encodes a path consisting of
line segments of length 1 from $(x_{i-1}, y_{i-1})$ to $(x_i, y_i)$ for $1 \le i \le k$. For example, Figure 1.2
displays the ten lattice paths from (0,0) to (2,3).


FIGURE 1.2 
Lattice paths from (0,0) to (2,3). 


1.98. The Lattice Path Rule. For all integers $a, b \ge 0$, there are $\binom{a+b}{a,\,b}$ lattice paths from
(0,0) to (a,b).


Proof. We can encode a lattice path P from (0,0) to (a,b) as a word $w \in R(E^a N^b)$ by
setting $w_i = E$ if $(x_i, y_i) = (x_{i-1} + 1, y_{i-1})$ and $w_i = N$ if $(x_i, y_i) = (x_{i-1}, y_{i-1} + 1)$. Here, E
stands for “east step,” and N stands for “north step.” Since the path ends at (a,b), w must
have exactly a occurrences of E and exactly b occurrences of N. Thus we have a bijection
between the given set of lattice paths and the set $R(E^a N^b)$. Since $|R(E^a N^b)| = \binom{a+b}{a,\,b}$, the
result follows from the Bijection Rule. □


For example, the paths shown in Figure 1.2 are encoded by the words 


NNNEE, NNENE, NNEEN, NENNE, NENEN, 
NEENN, ENNNE, ENNEN, ENENN, EENNN. 
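
The encoding of lattice paths by words is simple to reproduce. The sketch below (with names `lattice_path_words` and `word_to_points` introduced here for illustration) generates every word in $R(E^a N^b)$ by choosing which positions hold an E, and converts a word back to its sequence of lattice points.

```python
from itertools import combinations
from math import comb


def lattice_path_words(a, b):
    """All words with a letters E and b letters N, one per lattice path to (a, b)."""
    words = []
    for east_positions in combinations(range(a + b), a):
        east = set(east_positions)
        words.append("".join("E" if i in east else "N" for i in range(a + b)))
    return words


def word_to_points(word):
    """Decode a word into the list of lattice points visited, starting at (0, 0)."""
    x = y = 0
    points = [(0, 0)]
    for step in word:
        if step == "E":
            x += 1
        else:
            y += 1
        points.append((x, y))
    return points
```

Calling `lattice_path_words(2, 3)` produces the ten words listed above (possibly in a different order), and each decodes to a path ending at (2,3).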


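More generally, we can consider lattice paths in $\mathbb{R}^d$. Such a path is a sequence of points
$(v_0, v_1, \ldots, v_k)$ in $\mathbb{Z}^d$ such that for each i, $v_i = v_{i-1} + e_j$ for some standard basis vector
$e_j = (0, \ldots, 1, \ldots, 0) \in \mathbb{R}^d$ (the 1 occurs in position j). We can encode a path P from
$(0, \ldots, 0)$ to $(n_1, \ldots, n_d)$ by a word $w_1 w_2 \cdots w_n \in R(e_1^{n_1} \cdots e_d^{n_d})$, where $n = n_1 + \cdots + n_d$
and $w_i = e_j$ iff $v_i = v_{i-1} + e_j$. By the Bijection Rule and the Anagram Rule, the number
of lattice paths from $(0, \ldots, 0)$ to $(n_1, \ldots, n_d)$ is $\binom{n_1 + \cdots + n_d}{n_1, \ldots, n_d}$.

Henceforth, we usually will not distinguish between a lattice path (which is a sequence of
lattice points) and the word that encodes the lattice path. We now turn to a more difficult
enumeration problem involving lattice paths.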


1.99. Definition: Dyck Paths. A Dyck path of order n is a lattice path from (0,0) to
(n,n) such that $y_i \ge x_i$ for all points $(x_i, y_i)$ on the path. This requirement means that the
path always stays weakly above the line y = x. For example, Figure 1.3 displays the five
Dyck paths of order 3.


FIGURE 1.3
Dyck paths of order 3.


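1.100. Definition: Catalan Numbers. For $n \ge 0$, the nth Catalan number is

$$C_n = \frac{1}{n+1}\binom{2n}{n,\,n} = \frac{1}{2n+1}\binom{2n+1}{n+1,\,n} = \frac{(2n)!}{n!\,(n+1)!} = \binom{2n}{n,\,n} - \binom{2n}{n+1,\,n-1}.$$

One may check that these expressions are all equal. For instance,

$$\binom{2n}{n,\,n} - \binom{2n}{n+1,\,n-1} = \frac{(2n)!}{n!\,n!} - \frac{(2n)!}{(n+1)!\,(n-1)!} = \frac{(2n)!}{n!\,(n+1)!}\bigl[(n+1) - n\bigr] = \frac{(2n)!}{n!\,(n+1)!}.$$

The first few Catalan numbers are

$C_0 = 1$, $C_1 = 1$, $C_2 = 2$, $C_3 = 5$, $C_4 = 14$, $C_5 = 42$, $C_6 = 132$, $C_7 = 429$.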
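
The equality of the four expressions for $C_n$ can also be checked numerically. This is a small illustrative sketch; the function name `catalan_forms` is introduced here, and `math.comb` returns 0 when the lower index exceeds the upper one, which handles the case n = 0.

```python
from math import comb, factorial


def catalan_forms(n):
    """Evaluate the four expressions for the nth Catalan number, in the order given."""
    first = comb(2 * n, n) // (n + 1)
    second = comb(2 * n + 1, n) // (2 * n + 1)
    third = factorial(2 * n) // (factorial(n) * factorial(n + 1))
    fourth = comb(2 * n, n) - comb(2 * n, n + 1)
    return first, second, third, fourth
```

For each n from 0 to 7, the four returned values coincide and reproduce the list 1, 1, 2, 5, 14, 42, 132, 429.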


1.101. The Dyck Path Rule. For all $n \ge 0$, the number of Dyck paths of order n is the
Catalan number $C_n = \binom{2n}{n,\,n} - \binom{2n}{n+1,\,n-1}$.


Proof. We present a remarkable bijective proof due to D. André [3]. Let A be the set of
all lattice paths from (0,0) to (n,n); let B be the set of all lattice paths from (0,0) to
$(n+1, n-1)$; let C be the set of all Dyck paths of order n; and let $D = A - C$ be the set
of paths from (0,0) to (n,n) that go strictly below the line y = x. Since $C = A - D$, the
Difference Rule gives

$$|C| = |A| - |D|.$$

We already know that $|A| = \binom{2n}{n,\,n}$ and $|B| = \binom{2n}{n+1,\,n-1}$. To establish the required formula
$|C| = C_n$, it therefore suffices to exhibit a bijection $r : D \to B$.

We define r as follows. Given a path $P \in D$, follow the path backwards from (n,n)
until it goes below the diagonal y = x for the first time. Let $(x_i, y_i)$ be the first lattice
point we encounter that is below y = x; this point must lie on the line $y = x - 1$. P is
the concatenation of two lattice paths $P_1$ and $P_2$, where $P_1$ goes from (0,0) to $(x_i, y_i)$ and
$P_2$ goes from $(x_i, y_i)$ to (n,n). By choice of i, every lattice point of $P_2$ after $(x_i, y_i)$ lies
strictly above the line $y = x - 1$. Now, let $P_2'$ be the path from $(x_i, y_i)$ to $(n+1, n-1)$
obtained by reflecting $P_2$ in the line $y = x - 1$. Define r(P) to be the concatenation of $P_1$
and $P_2'$. See Figure 1.4 for an example. Here, $(x_i, y_i) = (7,6)$, $P_1$ = NEEENNENEEENN,
$P_2$ = NNNEENE, and $P_2'$ = EEENNEN. Note that r(P) is a lattice path from (0,0) to
$(n+1, n-1)$, so $r(P) \in B$. Furthermore, $(x_i, y_i)$ is the only lattice point of $P_2'$ lying on the
line $y = x - 1$.

FIGURE 1.4
Example of the reflection map r.


The inverse map $r' : B \to D$ acts as follows. Given $Q \in B$, choose i maximal such
that $(x_i, y_i)$ is a point of Q on the line $y = x - 1$. Such an i must exist, since there is
no way for a lattice path to reach $(n+1, n-1)$ from (0,0) without passing through this
line. Write $Q = Q_1 Q_2$, where $Q_1$ goes from (0,0) to $(x_i, y_i)$ and $Q_2$ goes from $(x_i, y_i)$ to
$(n+1, n-1)$. Let $Q_2'$ be the reflection of $Q_2$ in the line $y = x - 1$. Define $r'(Q) = Q_1 Q_2'$, and
note that this is a lattice path from (0,0) to (n,n) which passes through $(x_i, y_i)$, and hence
lies in D. See Figure 1.5 for an example. Here, $(x_i, y_i) = (6,5)$, $Q_1$ = NNENEEEENEN,
$Q_2$ = EEENENENN, and $Q_2'$ = NNNENENEE. From our observations about the point
$(x_i, y_i)$ in this paragraph and the last, it follows that r' is the two-sided inverse of r. □
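
On the level of words, both r and r' do the same thing: locate the last visit to the line $y = x - 1$ and interchange N and E in every step after it. The sketch below illustrates this under that observation; it is not taken from the original text, and the names `reflect_tail`, `paths`, and `is_dyck` are introduced here.

```python
from itertools import combinations


def reflect_tail(word):
    """Swap N and E in the part of the path after its last point on y = x - 1."""
    diff = 0      # (number of E's) - (number of N's); equals 1 exactly on y = x - 1
    last = None   # number of steps taken when the line was last visited
    for steps_taken, step in enumerate(word, start=1):
        diff += 1 if step == "E" else -1
        if diff == 1:
            last = steps_taken
    if last is None:
        raise ValueError("path never meets the line y = x - 1")
    swap = {"N": "E", "E": "N"}
    return word[:last] + "".join(swap[s] for s in word[last:])


def paths(a, b):
    """All lattice paths from (0,0) to (a,b), encoded as words in E and N."""
    total = a + b
    result = []
    for east_positions in combinations(range(total), a):
        east = set(east_positions)
        result.append("".join("E" if i in east else "N" for i in range(total)))
    return result


def is_dyck(word):
    """True iff the path stays weakly above y = x (never more E's than N's so far)."""
    diff = 0
    for step in word:
        diff += 1 if step == "E" else -1
        if diff > 0:
            return False
    return True
```

For n = 3, applying `reflect_tail` to the 15 non-Dyck paths from (0,0) to (3,3) yields each of the 15 paths from (0,0) to (4,2) exactly once, leaving $20 - 15 = 5$ Dyck paths, in agreement with $C_3 = 5$.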


The technique used in the preceding proof is called André's Reflection Principle. Another
proof of the Dyck Path Rule, which leads directly to the formula $\frac{1}{2n+1}\binom{2n+1}{n+1,\,n}$, is given in
§12.1. Yet another proof, which leads directly to the formula $\frac{1}{n+1}\binom{2n}{n,\,n}$, is given in §12.2.


1.15 Proofs of the Sum Rule and the Product Rule 


Many people accept the Sum Rule and the Product Rule as being intuitively evident. 
Nevertheless, it is possible to give formal proofs of these results using induction and the 
definition of cardinality. We sketch these proofs now. 

Step 1. Assume $A$ and $B$ are finite disjoint sets with $|A| = n$ and $|B| = m$; we prove
$|A \cup B| = n + m$. The assumption $|A| = n$ means that there is a bijection $f : A \to
\{1,2,\ldots,n\}$. The assumption $|B| = m$ means that there is a bijection $g : B \to \{1,2,\ldots,m\}$.
Define a function $h : A \cup B \to \{1,2,\ldots,n+m\}$ by setting

$$h(x) = \begin{cases} f(x) & \text{if } x \in A; \\ g(x)+n & \text{if } x \in B. \end{cases}$$


Since $A$ and $B$ have no common elements, $h$ is a well-defined (single-valued) function.
Moreover, since $f(x) \in \{1,2,\ldots,n\}$ and $g(x) \in \{1,2,\ldots,m\}$, $h$ does map into the required
codomain $\{1,2,\ldots,n+m\}$. To see that $h$ is a bijection, we display a two-sided inverse
$h' : \{1,2,\ldots,n+m\} \to A \cup B$. We define

$$h'(i) = \begin{cases} f^{-1}(i) & \text{if } 1 \le i \le n; \\ g^{-1}(i-n) & \text{if } n+1 \le i \le n+m. \end{cases}$$

It is routine to verify that $h'$ is single-valued, $h'$ maps into the codomain $A \cup B$, and $h \circ h'$
and $h' \circ h$ are identity maps.

FIGURE 1.5 
Example of the inverse reflection map. 
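The construction of h and h' in Step 1 can be sketched in Python. This is a minimal illustration, not part of the proof: the bijections f and g are represented as dictionaries, and the function name `glue_bijections` is hypothetical.

```python
def glue_bijections(f, g):
    """Build h and h' from Step 1.

    f maps the elements of A onto {1, ..., n}; g maps the elements of B
    onto {1, ..., m}. Both are given as dicts (an assumption of this sketch).
    Returns (h, h_inv), where h maps A ∪ B onto {1, ..., n + m}.
    """
    n, m = len(f), len(g)
    if set(f) & set(g):
        raise ValueError("A and B must be disjoint")
    if sorted(f.values()) != list(range(1, n + 1)):
        raise ValueError("f must be a bijection onto {1, ..., n}")
    if sorted(g.values()) != list(range(1, m + 1)):
        raise ValueError("g must be a bijection onto {1, ..., m}")

    # h(x) = f(x) for x in A, and h(x) = g(x) + n for x in B.
    h = dict(f)
    for x, j in g.items():
        h[x] = j + n

    # h'(i) = f^{-1}(i) for 1 <= i <= n, and g^{-1}(i - n) otherwise.
    f_inv = {i: x for x, i in f.items()}
    g_inv = {j: x for x, j in g.items()}
    h_inv = {i: (f_inv[i] if i <= n else g_inv[i - n])
             for i in range(1, n + m + 1)}
    return h, h_inv


f = {"a": 1, "b": 2, "c": 3}
g = {"x": 2, "y": 1}
h, h_inv = glue_bijections(f, g)
print(h["x"], h_inv[4])  # 5 y
```

The disjointness check mirrors the point in the proof where disjointness is needed: without it, h would not be single-valued.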


Step 2. We prove the Sum Rule, as formulated in 1.39, by induction on $k \ge 1$. Assume
$S_1,\ldots,S_k$ are finite, pairwise disjoint sets with $|S_i| = m_i$ for $i = 1,\ldots,k$; we want to prove
$|S_1 \cup \cdots \cup S_k| = m_1 + \cdots + m_k$. This is immediate if $k = 1$; it follows from Step 1 if $k = 2$.
For fixed $k > 2$, we may assume by induction that $|S_1 \cup \cdots \cup S_{k-1}| = m_1 + \cdots + m_{k-1}$ is
already known. Now we apply Step 1 taking $A = S_1 \cup \cdots \cup S_{k-1}$ and $B = S_k$. Since $S_k$
is disjoint from each of the sets $S_1,\ldots,S_{k-1}$, $B$ is disjoint from $A$. Then Step 1 and the
induction hypothesis give

$$|S_1 \cup \cdots \cup S_{k-1} \cup S_k| = |A \cup B| = |A| + |B| = (m_1 + \cdots + m_{k-1}) + m_k.$$


Step 3. Suppose $S_1,\ldots,S_k$ are any finite sets with $|S_i| = n_i$ for $i = 1,\ldots,k$; we prove
$|S_1 \times S_2 \times \cdots \times S_k| = n_1n_2\cdots n_k$ by induction on $k \ge 1$. The result is immediate for
$k = 1$. For fixed $k > 1$, we may assume by induction that the set $A = S_1 \times \cdots \times S_{k-1}$
has size $|A| = n_1 \cdots n_{k-1}$. Any set of the form $A \times \{c\}$ also has size $n_1 \cdots n_{k-1}$, since
$f(a_1,\ldots,a_{k-1}) = (a_1,\ldots,a_{k-1},c)$ defines a bijection from $A$ onto this set. Now $S_1 \times \cdots \times
S_{k-1} \times S_k = A \times S_k$ is the union of $n_k$ pairwise disjoint sets of the form $A \times \{c\}$, as $c$ ranges
over the $n_k$ elements in $S_k$. Each of these sets has size $n_1 \cdots n_{k-1}$, so Step 2 shows that

$$|S_1 \times \cdots \times S_k| = n_1 \cdots n_{k-1} + \cdots + n_1 \cdots n_{k-1} \ (n_k \text{ terms}) = n_1 \cdots n_{k-1}n_k.$$


Step 4. We prove the Product Rule 1.1 by formally restating the rule as follows. Suppose
we build the objects in a set $S$ by making a sequence of $k$ choices, where there are $n_i$ possible
choices at stage $i$ for $i = 1,2,\ldots,k$. We model the choices available at stage $i$ by the
$n_i$-element set $S_i = \{1,2,\ldots,n_i\}$, and we model the full choice sequence by an ordered $k$-tuple
$(c_1,\ldots,c_k) \in S_1 \times \cdots \times S_k$. The hypothesis of the Product Rule amounts to the assumption
that there exists a bijection $f : S_1 \times \cdots \times S_k \to S$. The bijection $f$ tells us how to take
a given list of choices $(c_1,\ldots,c_k)$ and build an object in $S$. (Note that how $f$ uses the
choice value $c_i$ is allowed to depend on the previous choices $c_1,\ldots,c_{i-1}$; but the number of
possibilities for $c_i$ does not depend on previous choices. We usually specify $f$ by an informal
verbal description or an algorithm.) It follows from the Bijection Rule and Step 3 that
$|S| = n_1 \cdots n_k$, as needed.


1.102. Remark. We can unravel the induction arguments above to obtain explicit bijective
proofs of the general Sum Rule (for $k$ sets) and the general Product Rule. These bijections
will help us create algorithms (called ranking and unranking maps) for listing and storing 
collections of combinatorial objects. We discuss such algorithms in Chapter 6. 
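As a rough preview of what such maps can look like, here is one possible pair of ranking and unranking functions for the Cartesian product $S_1 \times \cdots \times S_k$ with $S_i = \{1,\ldots,n_i\}$. This is a sketch under stated assumptions, not necessarily the construction used in Chapter 6: the names `rank` and `unrank` and the choice of ranks in $\{0, 1, \ldots, n_1 \cdots n_k - 1\}$ are conventions chosen here for illustration.

```python
from itertools import product
from math import prod


def rank(choices, sizes):
    """Map (c_1, ..., c_k), with 1 <= c_i <= n_i, to an integer in [0, n_1*...*n_k)."""
    if len(choices) != len(sizes):
        raise ValueError("need one choice per stage")
    r = 0
    for c, n in zip(choices, sizes):
        if not 1 <= c <= n:
            raise ValueError("choice out of range")
        r = r * n + (c - 1)  # mixed-radix digit for this stage
    return r


def unrank(r, sizes):
    """Inverse of rank: recover the choice sequence from its rank."""
    if not 0 <= r < prod(sizes):
        raise ValueError("rank out of range")
    choices = []
    for n in reversed(sizes):
        r, digit = divmod(r, n)  # peel off the last stage first
        choices.append(digit + 1)
    return tuple(reversed(choices))


sizes = (2, 3)
for choices in product(range(1, 3), range(1, 4)):
    print(choices, rank(choices, sizes))
```

Because `rank` and `unrank` are mutually inverse, they give an explicit bijection between the product set and a set of $n_1 \cdots n_k$ integers, which is the content of Step 3.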


Summary 
We end each chapter by summarizing the main definitions and results from the chapter. 


• Notation.
Factorials: $0! = 1$ and $n! = n \cdot (n-1) \cdots 3 \cdot 2 \cdot 1$.
Binomial Coefficients: $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ for $0 \le k \le n$; $\binom{n}{k} = 0$ otherwise.
Multinomial Coefficients: $\binom{n}{n_1,\ldots,n_k} = \frac{n!}{n_1!n_2!\cdots n_k!}$, where $n = \sum_{i=1}^{k} n_i$ and $n_i \ge 0$.
Anagram Sets: $R(a_1^{n_1} \cdots a_k^{n_k})$ is the set of all words consisting of $n_i$ copies of $a_i$.
Types of Braces: $(a_1,\ldots,a_n)$ is a sequence (order and repetition matter);
$\{a_1,\ldots,a_n\}$ is a set (order and repetition do not matter);
$[a_1,\ldots,a_n]$ is a multiset (repetition matters, order does not).


• Definitions of Combinatorial Objects.
Words: sequences $w_1w_2\cdots w_k$ (order matters, repeated letters can occur).
Permutations of A: sequences $w_1w_2\cdots w_n$ using each letter in $A$ exactly once.
Partial Permutations: sequences $w_1w_2\cdots w_k$ (order matters, no repeated letters).
Anagrams: words where each letter occurs a specified number of times.
Sets: collections of objects where order and repetition do not matter.
Multisets: collections of objects where repetition matters but order does not.
Functions: $f : X \to Y$ means for all $x \in X$ there is a unique $y \in Y$ with $y = f(x)$.
Injections: functions such that unequal inputs map to unequal outputs.
Surjections: functions such that every $y$ in the codomain is $f(x)$ for some $x$.
Bijections: functions that are injective and surjective; functions with two-sided inverses.
Compositions of k: sequences $\alpha = (\alpha_1,\ldots,\alpha_s)$ of positive integers summing to $k$.
Lattice Paths: sequences of points in $\mathbb{Z}^2$ joined by north and east steps.
Dyck Paths: lattice paths from $(0,0)$ to $(n,n)$ not going below $y = x$.


• Basic Counting Rules.
The Product Rule: If every object in $S$ can be constructed in exactly one way by making
a sequence of $k$ choices, and there are $n_i$ ways to make choice $i$ no matter what choices
were made earlier, then $|S| = n_1n_2\cdots n_k$.

The Sum Rule: If every object in $S$ belongs to exactly one of $k$ non-overlapping categories,
where category $i$ contains $m_i$ objects, then $|S| = m_1 + m_2 + \cdots + m_k$.


The Word Rule: There are $n^k$ words of length $k$ using an $n$-letter alphabet.

The Permutation Rule: There are $n!$ permutations of an $n$-letter alphabet.

The Partial Permutation Rule: For $k$ between 0 and $n$, the number of $k$-letter partial
permutations of an $n$-letter alphabet is $n!/(n-k)!$.

The Power Set Rule: An $n$-element set has $2^n$ subsets.

The Subset Rule: The number of $k$-element subsets of an $n$-element set is $\binom{n}{k}$.

The Anagram Rule: $|R(a_1^{n_1}a_2^{n_2}\cdots a_k^{n_k})| = \binom{n_1+n_2+\cdots+n_k}{n_1,n_2,\ldots,n_k}$.


The Union Rule for Two Sets: When $S$ and $T$ are finite, $|S \cup T| = |S| + |T| - |S \cap T|$.

The Disjoint Union Rule: If $|S_i| = m_i$ for $1 \le i \le k$ and $S_i \cap S_j = \emptyset$ for $i \ne j$, then
$|S_1 \cup S_2 \cup \cdots \cup S_k| = m_1 + m_2 + \cdots + m_k$.

The Difference Rule: When $S$ is finite, $|S - U| = |S| - |S \cap U|$.

The Cartesian Product Rule: If $|S_i| = n_i$ for $1 \le i \le k$, then $|S_1 \times S_2 \times \cdots \times S_k| =
n_1n_2\cdots n_k$.

The Function Rule: When $|X| = k$ and $|Y| = n$, there are $n^k$ functions $f : X \to Y$.

The Injection Rule: When $|X| = k$ and $|Y| = n \ge k$, there are $n!/(n-k)!$ injections
$f : X \to Y$.

The Surjection Rule: When $|X| = k$ and $|Y| = n$, there are $\sum_{j=0}^{n} (-1)^j \binom{n}{j} (n-j)^k$
surjections $f : X \to Y$ (see §4.3 for the proof).

The Bijection-Counting Rule: If $|X| = |Y| = n$, there are $n!$ bijections $f : X \to Y$.

The Bijection Rule: If there exists a bijection $f : A \to B$ or $g : B \to A$, then $|A| = |B|$.

The Multiset Rule: The number of $k$-letter multisets using an $n$-letter alphabet is
$\binom{k+n-1}{k,n-1}$.

The Integer Equation Rule: The number of nonnegative integer solutions to $x_1 + \cdots + x_n =
k$ is $\binom{k+n-1}{k,n-1}$. The number of positive integer solutions to $y_1 + \cdots + y_n = k$ is $\binom{k-1}{n-1}$.

The Composition Rule: There are $2^{k-1}$ compositions of $k$. There are $\binom{k-1}{s-1}$ compositions
of $k$ with $s$ parts.

Rules for Labeled Balls in Labeled Boxes: There are $n^k$ ways to put $k$ labeled balls in $n$
labeled boxes. If at most one ball goes in each box, there are $n!/(n-k)!$ distributions
for $k \le n$. If at least one ball goes in each box, there are $\sum_{j=0}^{n} (-1)^j \binom{n}{j} (n-j)^k$
distributions.

Rules for Unlabeled Balls in Labeled Boxes: There are $\binom{k+n-1}{k,n-1}$ ways to put $k$ identical
balls in $n$ labeled boxes. If at most one ball goes in each box, there are $\binom{n}{k}$ distributions.
If at least one ball goes in each box, there are $\binom{k-1}{n-1}$ distributions.


The Lattice Path Rule: There are $\binom{a+b}{a,b}$ lattice paths from $(x_0,y_0)$ to $(x_0+a, y_0+b)$.

The Dyck Path Rule: There are $C_n = \frac{1}{n+1}\binom{2n}{n,n}$ Dyck paths from $(0,0)$ to $(n,n)$.
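Several of the rules above can be sanity-checked on small cases by listing the objects and comparing the count with the formula. The following Python sketch does this for a few rules. The function names are hypothetical, the alphabet and codomain are taken to be {0, ..., n-1} for convenience, and a brute-force check on small inputs is of course not a proof.

```python
from itertools import combinations, combinations_with_replacement, permutations, product
from math import comb, factorial


def surjection_formula(k, n):
    """Surjection Rule: sum over j of (-1)^j * C(n, j) * (n - j)^k."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** k for j in range(n + 1))


def is_dyck(path):
    """True if the N/E word ends on y = x and never goes below it."""
    height = 0
    for step in path:
        height += 1 if step == "N" else -1
        if height < 0:
            return False
    return height == 0


def brute_force_counts(n, k):
    """Count objects by listing them (only feasible for small n and k)."""
    letters = range(n)
    words = list(product(letters, repeat=k))
    return {
        "words": len(words),
        "partial permutations": sum(1 for _ in permutations(letters, k)),
        "subsets": sum(1 for _ in combinations(letters, k)),
        "multisets": sum(1 for _ in combinations_with_replacement(letters, k)),
        # a function from a k-set to an n-set is a word of length k;
        # it is surjective when all n letters are used
        "surjections": sum(1 for w in words if len(set(w)) == n),
        "dyck paths": sum(is_dyck(p) for p in product("NE", repeat=2 * n)),
    }


def formula_counts(n, k):
    """The same quantities from the rules in the summary (this sketch assumes k <= n)."""
    return {
        "words": n ** k,
        "partial permutations": factorial(n) // factorial(n - k),
        "subsets": comb(n, k),
        "multisets": comb(k + n - 1, k),
        "surjections": surjection_formula(k, n),
        "dyck paths": comb(2 * n, n) // (n + 1),
    }


print(brute_force_counts(4, 3) == formula_counts(4, 3))  # True
```

The multinomial coefficient $\binom{k+n-1}{k,n-1}$ in the Multiset Rule equals the binomial coefficient computed by `comb(k + n - 1, k)`, which is why the latter appears in the code.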


• Probability Definitions. A sample space is the set $S$ of outcomes for some random
experiment. An event is a subset of the sample space. When all outcomes in $S$ are equally
likely, the probability of an event $A$ is $P(A) = |A|/|S|$. The conditional probability of $A$
given $B$ is $P(A|B) = P(A \cap B)/P(B)$, when $P(B) > 0$. Events $A$ and $B$ are independent
iff $P(A \cap B) = P(A)P(B)$.
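These definitions translate directly into code when all outcomes are equally likely. The sketch below is an illustration only: the sample space of two dice rolls and the three events are examples chosen here, and the helper names `prob`, `cond_prob`, and `independent` are hypothetical. Exact fractions are used so that the independence test is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def prob(event, space):
    """P(A) = |A| / |S| for equally likely outcomes."""
    return Fraction(len(event & space), len(space))


def cond_prob(a, b, space):
    """P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if not (b & space):
        raise ValueError("conditioning event has probability 0")
    return prob(a & b, space) / prob(b, space)


def independent(a, b, space):
    """A and B are independent iff P(A ∩ B) = P(A) P(B)."""
    return prob(a & b, space) == prob(a, space) * prob(b, space)


# Illustrative sample space: ordered results of rolling two fair dice.
space = set(product(range(1, 7), repeat=2))
first_even = {(i, j) for (i, j) in space if i % 2 == 0}
sum_is_7 = {(i, j) for (i, j) in space if i + j == 7}
sum_at_least_10 = {(i, j) for (i, j) in space if i + j >= 10}

print(independent(first_even, sum_is_7, space))          # True
print(independent(first_even, sum_at_least_10, space))   # False
print(cond_prob(sum_at_least_10, first_even, space))     # 2/9
```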


Exercises 


1-1. (a) How many seven-digit phone numbers are possible? (b) How many such phone 
numbers have no repeated digit? (c) How many seven-digit phone numbers do not start 
with 0, 1, 911, 411, or 555? 

1-2. A key for the DES encryption system is a word of length 56 in the alphabet {0,1}. 
A key for a permutation cipher is a permutation of the 26-letter English alphabet. Which 
encryption system has more keys?

1-3. A key for the AES encryption system is a word of length 128 in the alphabet {0,1}.
Suppose we try to decrypt an AES message by exhaustively trying every possible key. 
Assume six billion computers are running in parallel, where each computer can test one 
trillion keys per second. Estimate the number of years required for this attack to search the 
entire space of keys. 


1-4. Solve Example 1.8 using only the Product Rule. 


1-5. (a) How many four-digit numbers consist of distinct even digits? (b) How many five- 
digit numbers contain only odd digits with at least one repeated digit? 


1-6. How many four-digit numbers start with an odd digit, are divisible by 5, and do not 
contain the digits 6 or 7? 


1-7. How many four-digit numbers consist of four consecutive digits in some order (e.g., 
3214 or 9678 or 1023)? 


1-8. How many four-digit even numbers contain the digit 5 but not the digit 2? 


1-9. Find the minimum k such that every printable character on a standard computer 
keyboard can be encoded by a distinct word in {0,1} of length exactly k. Does the answer 
change if we allow nonempty words of length at most k? 


1-10. (a) How many four-letter words $w$ using an $n$-letter alphabet satisfy $w_i \ne w_{i+1}$ for
$i = 1,2,3$? (b) How many of the words in (a) also satisfy $w_4 \ne w_1$?

1-11. Solve Example 1.11 again with the additional condition that all the letters in the 
word must be distinct. 


1-12. How many n-letter words in the alphabet {A, B, ..., Z} contain: (a) only vowels;
(b) no vowels; (c) at least one vowel; (d) alternating vowels and consonants; (e) two vowels 
and n — 2 consonants? 


1-13. (a) How many license plates consist of three uppercase letters followed by three 
digits? (b) How many license plates consist of three letters and three digits in some order? 
(c) How many license plates consist of seven characters (letters or digits), where all the 
letters precede all the digits and there must be at least one letter and at least one digit? 


1-14. (a) How many six-digit numbers contain the digit 8 but not 9? (b) How many six-digit 
numbers contain the digit 0 or 1? 


1-15. A pizza shop offers ten toppings. How many pizzas can be ordered with: (a) three dif- 
ferent toppings; (b) up to three different toppings; (c) three toppings, with repeats allowed; 
(d) four different toppings, but pepperoni and sausage cannot be ordered together? 


1-16. (a) How many numbers between 1 and 1000 are divisible by 5 or 7? (b) How many 
such numbers are divisible by 5 or 7, but not both? 


1-17. How many three-digit numbers: (a) do not contain the digits 5 or 7; (b) contain the 
digits 5 and 7; (c) contain the digits 5 or 7; (d) contain exactly one of the digits 5 or 7? 


1-18. Count the number of five-digit integers x having each property. (a) All digits of x are 
distinct. (b) The digit 8 appears and x is divisible by 5. (c) x is odd with distinct digits. 
(d) x is even with distinct digits. (e) x starts with 7 and is even. (f) x starts with 7 or is 
even. (g) x contains the digit 3 but not the digits 4 or 5. (h) x contains consecutive digits
686. (i) Any digit 6 in x is immediately followed by the digit 7. (j) x contains the digit 2
and the digit 9. (k) x contains the digit 6 exactly three times. (l) x contains distinct digits
in increasing order. 


1-19. Let S be the set of eight-letter words in the alphabet {A, B, ..., Z}. (a) Find the size
of S. (b) How many words in S have all letters distinct? (c) How many words in S have no
two adjacent letters equal? (d) How many words in S begin with a vowel and end with a
consonant? (e) How many words in S begin with a vowel or end with a consonant, but not
both? (f) How many words in S are rearrangements of the word LOOPHOLE? (g) How many
words in S have no two consecutive vowels and no two consecutive consonants? (h) How
many words in S are palindromes? (i) How many words in S start with three vowels and end
with five consonants? (j) How many words in S contain three vowels and five consonants
in some order? (k) How many words in S contain four consecutive letters MATH (in this
order) somewhere in the word?


1-20. A class consists of nine women (including Alice and Carla) and eleven men (including 
Bob and Dave). We want to form a committee of six members of the class. Count the 
number of committees satisfying each condition below. (a) Any six people can be on the 
committee. (b) The committee must have four men and two women. (c) The committee 
must have more men than women. (d) Alice and Carla must both be on the committee. 
(e) Bob or Dave must be on the committee. (f) Exactly two of Alice, Bob, Carla, and Dave 
must be on the committee. 


1-21. Consider a business employing fifteen lawyers (including Jones), ten salesmen (includ- 
ing Smith), and eight clerks (including Reed). Find the number of committees satisfying 
each condition. (a) The size of the committee is five or six. (b) The committee consists of 
three lawyers, two salesmen, and one clerk. (c) The committee has six members, at least 
three of whom are lawyers. (d) The committee has seven members, including Reed and 
Jones but excluding Smith. (e) The committee has four lawyers (including Jones) and three
salesmen (excluding Smith). (f) The committee has six members, including Jones or Smith. 
(g) The committee has six members, including at least one lawyer, one salesman, and one 
clerk. 


1-22. How many ways can four husbands and wives be seated in a row of ten seats with 
each restriction below? (a) The two end seats must be empty; (b) there are no restrictions; 
(c) all the men sit to the left of all the women; (d) Bob does not sit immediately next to 
anyone; (e) men and women are seated alternately (ignoring empty seats); (f) each husband 
and wife sit immediately next to each other. 


1-23. Suppose $|A| = m$, $|B| = n$, and $|A \cup B| = p$. Find $|(A \times B) - (B \times A)|$.

1-24. (a) How many anagrams of MISSISSIPPI are there? (b) How many of these anagrams
begin and end with P? (c) In how many of these anagrams are the two P’s adjacent? (d) In
how many of these anagrams are no two I’s adjacent?


1-25. (a) How many straight flush poker hands are there? (b) How many five-card poker 
hands are straights or flushes? 


1-26. A two-pair poker hand is a hand with three different values, two of which occur twice; 
an example is {20, 2@, 5, 5&, KO}. Someone tries to count these hands by the following 
Product Rule argument: choose the value of the first pair ($n_1 = 13$ ways); choose two suits
out of four for this pair ($n_2 = \binom{4}{2} = 6$ ways); choose the value of the second pair ($n_3 = 12$
ways); choose two suits out of four for this pair ($n_4 = \binom{4}{2} = 6$ ways); choose the value for
the last card ($n_5 = 11$ ways); choose the suit for the last card ($n_6 = 4$ ways); the answer is
$13 \cdot 6 \cdot 12 \cdot 6 \cdot 11 \cdot 4 = 247{,}104$. Explain exactly what is wrong with this counting argument,
illustrating using the sample hand above. Then find the correct answer. 


1-27. What is wrong with the following argument for counting one-pair poker hands? Choose 
any card ($n_1 = 52$ ways); then choose another card of the same value to make the pair
($n_2 = 3$ ways); then choose three more cards of different values ($n_3 = 48$ ways, $n_4 = 44$
ways, $n_5 = 40$ ways); the answer is 13,178,880. Also explain why this answer is an integer
multiple of the true answer. 


1-28. (a) How many five-card hands have at least one card of every suit? (b) How many 
six-card hands have at least one card of every suit?

1-29. For $1 \le k \le 5$, find the number of ways to place $k$ non-attacking rooks on the board
shown below. 


1-30. How many ways can we place k non-attacking rooks on an n x n chessboard? 


1-31. (a) Verify the claim in the Introduction that for each $k \ge 1$, there are the same
number of ways to place k non-attacking rooks on each of the two boards below. 


(b) Show that the board with row lengths (6, 6, 2, 2) is also rook-equivalent to the two boards 
above. 


1-32. Consider a board with left-justified rows of lengths $a_1 \ge a_2 \ge \cdots \ge a_m$. Find the
number of ways to place $k$ non-attacking rooks on this board for $k = 1$, $k = m$, and
$k = m-1$.

1-33. A DNA strand can be modeled as a word in the alphabet {A,C,G,T}. (a) How many 
DNA strands have length 12? (b) How many strands in (a) are palindromes? (c) How many 
strands of length 12 are there if we do not distinguish between a strand and its reversal? 
(d) Define the complement of a DNA strand to be the strand obtained by interchanging all 
A’s and T’s and interchanging all C’s and G’s. How many strands of length 12 are there if 
we do not distinguish between a strand and its complement? 

1-34. Repeat the preceding question considering only strands consisting of four A’s, four 
T’s, two C’s, and two G’s. 


1-35. (a) How many n-letter words using a k-letter alphabet are palindromes? (b) How
many n-digit integers are palindromes? (c) How many words in $R(a_1^{n_1}a_2^{n_2}\cdots a_k^{n_k})$ are
palindromes?


1-36. (a) For a fixed $k < n$, count the number of permutations $w$ of $\{1,2,\ldots,n\}$ such that

$$w_1 < w_2 < \cdots < w_k > w_{k+1} < w_{k+2} < \cdots < w_n. \qquad (1.3)$$

(b) How many permutations satisfy (1.3) for some $k \in \{1,2,\ldots,n-1\}$?

1-37. A relation from X to Y is any subset of X x Y. Suppose X has n elements and Y 
has k elements. (a) How many relations from X to Y are there? (b) How many relations 
R satisfy the following property: for each y € Y, there exists at most one x € X with 
(x,y) € R? 

1-38. For any sets $S$ and $T$, define $S \triangle T = (S - T) \cup (T - S)$, so $x \in S \triangle T$ iff $x$ belongs to
exactly one of $S$ and $T$. Prove: for all finite sets $S$ and $T$, $|S \triangle T| = |S| + |T| - 2|S \cap T|$.

1-39. Use the Union Rule for Two Sets (see 1.41) to prove the Union Rule for Three Sets:
for all finite sets $A, B, C$,

$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.$$


1-40. How many positive integers less than 10,000 are not divisible by 7, 11, or 13? (Use 
the previous exercise.)

1-41. Two fair dice are rolled. Find the probability that: (a) the same number appears on
both dice; (b) the sum of the numbers rolled is 8; (c) the sum of the numbers rolled is 
divisible by 3; (d) the two numbers rolled differ by 1. 


1-42. In blackjack, you have been dealt two cards from a shuffled 52-card deck: 99 and 
6. Find the probability that drawing one more card will cause the sum of the three card 
values to go over 21. (Here, an ace counts as 1 and other face cards count as 10.) 


1-43. Find the probability that a random 5-letter word in {A, B,..., Z}: (a) has no repeated 
letters; (b) contains no vowels; (c) is a palindrome. 


1-44. A company employs ten men (one of whom is Bob) and eight women (one of whom 
is Alice). A four-person committee is randomly chosen. Find the probability that the com- 
mittee: (a) consists of all men; (b) consists of two men and two women; (c) does not have 
both Alice and Bob as members. 


1-45. A fair coin is tossed ten times. (a) Find the probability of getting exactly seven heads. 
(b) Find the probability of getting at least two heads. (c) Find the probability of getting 
exactly seven heads, given that the number of heads was prime. 


1-46. A fair die is rolled ten times. What is the probability that, in these ten tosses, 1 
comes up five times, 3 comes up two times, and 6 comes up three times? 


1-47. We draw 10 balls (without replacement) from an urn containing 40 red, 30 blue, and 
30 white balls. (a) What is the probability that no blue balls are drawn? (b) What is the 
probability of getting 4 red, 3 blue, and 3 white balls? (c) What is the probability that all 
10 balls have the same color? (d) Answer the same questions assuming the balls are drawn 
with replacement. 


1-48. A sequence of four cards is dealt from a 52-card deck (order matters). (a) What is 
the size of the sample space? (b) Find the probability of getting no clubs. (c) Find the 
probability of getting all four cards of the same value. (d) Find the probability that the 
first card is red and the last card is a spade. (e) Find the probability of getting one card 
from each suit. (f) Find the conditional probability that the last card is an ace given that 
the first two cards are not aces. 


1-49. A sequence of four cards is dealt from a 52-card deck (order matters). (a) What is the 
size of the sample space? (b) Find the probability that your hand has at least one king but 
no diamonds. (c) Given that the hand contains 39 and 69, find the probability of a flush. 
(d) Given that the hand contains 39 and 69, find the probability of a straight. (e) Given 
that the hand contains 8 and 84, find the probability of a four-of-a-kind. (f) Given that 
the hand contains 8& and 8@, find the probability of a full house. 


1-50. A license plate consists of three uppercase letters followed by four digits. A random 
license plate is chosen. (a) Find the probability that the plate has three distinct letters and 
four distinct digits. (b) Find the probability that the plate has a Z but not an 8. (c) Find 
the probability that the plate has an 8 or does not have a Z. (d) Find the probability that 
the plate is a rearrangement of JZB-3553. (e) Find the probability that the plate starts with 
ACE and ends with four distinct digits in increasing order. 

1-51. Let $P$ be a probability measure. (a) Prove: for all events $A$ and $B$, $P(A - B) =
P(A) - P(A \cap B)$. (b) Prove: for all events $A$ and $B$, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

1-52. A 5-card poker hand is dealt from a 52-card deck. (a) What is the probability that 
the hand contains only red cards (i.e., hearts and diamonds)? (b) What is the probability 
that the hand contains exactly two 8’s? (c) What is the probability that the hand contains 
only numerical cards (i.e., ace, jack, queen, and king may not appear)? 


1-53. (a) Find the probability of getting a three-of-a-kind poker hand (this is a hand with
three different values, one of which occurs three times; an example is {40 4@,40, 70, Q@}).
(b) Find the probability of getting a three-of-a-kind hand containing three fours. 


1-54. A bad poker hand is a hand with five different non-consecutive values and at least two 
suits; an example is {20,30,49,59,7}. (a) Use the Product Rule and Difference Rule 
to enumerate bad poker hands. (b) Check your answer by subtracting the number of good 
poker hands (found in the text and earlier exercises) from the total number of hands. 


1-55. Consider the three urns from Example 1.69. An urn is selected at random, and then 
one ball is selected from that urn. What is the probability that: (a) the ball is blue, given 
that urn 2 was chosen; (b) the ball is blue; (c) urn 2 was chosen, given that the ball was 
blue? 


1-56. A fair coin is tossed three times. (a) Describe the sample space. (b) Consider the 
following events. A: second toss is tails; B: second and third tosses disagree; C: all tosses 
are the same; D: the number of heads is even. Describe each event as a subset of the sample 
space. (c) Which pairs of events from { A, B, C, D} are independent? (d) Is the list of events 
A, B, D independent? Explain. 

1-57. Describe the sample space of the urn experiment in Example 1.69. Find the probability 
of each outcome in the sample space, and verify that these probabilities add to 1. 


1-58. A 5-card poker hand is dealt from a 52-card deck. (a) What is the probability that 
the hand is a flush, given that the hand contains no clubs? (b) What is the probability that 
the hand contains at least one card from each of the four suits, given that the hand contains 
both a red and a black card? (c) What is the probability of getting a two-pair hand, given 
that at least two cards in the hand have the same value? 

1-59. Let A, B,C be events in a probability space S. Assume A and C are independent, and 
B and C are independent. (a) Give an example where AU B and C are not independent. 
(b) Prove that AU B and C are independent if A and B are disjoint. (c) Must AN B and 
C be independent? Explain. 

1-60. (a) How many n-letter words using the alphabet {0,1} contain both the symbols 0 
and 1? (b) How many n-letter multisets in the alphabet {0,1} contain both the symbols 0 
and 1? 

1-61. How many lattice paths from (0,0) to (7,5) pass through the point (2,3)? 

1-62. (a) How many five-digit numbers have strictly decreasing digits reading left to right? 
(b) How many five-digit numbers have weakly increasing digits reading left to right? 
1-63. A two-to-one function is a function $f : X \to Y$ such that for every $y \in Y$, there exist
exactly two elements $x_1, x_2 \in X$ with $f(x_1) = y = f(x_2)$. How many two-to-one functions
are there from a 2n-element set to an n-element set?

1-64. Suppose we try to adapt the argument before Rule 1.27 to count k-element multisets 
from an n-element alphabet, as follows. Choose the first element of the multiset in n ways, 
choose the second element in n ways, and so on; then divide by k! since order does not 
matter in a multiset. The answer $n^k/k!$ cannot be correct because it is not even an integer
in general. Explain exactly what is wrong with this counting argument. 

1-65. Count the number of surjections from an n-element set to an m-element set for each 
of these special cases: (a) $m = 1$; (b) $m = 2$; (c) $n = m$; (d) $n < m$; (e) $n = m + 2$.

1-66. List all 5-letter words in {0,1} that do not contain two consecutive zeroes. 

1-67. List all permutations $w_1w_2w_3w_4$ of $\{1,2,3,4\}$ where $w_i \ne i$ for all $i$.

1-68. List all words in R(x?y?z7) with no two consecutive y’s. 

1-69. List all bijections $f : \{1,2,3,4\} \to \{1,2,3,4\}$ such that $f \circ f$ is the identity map.

1-70. List all bijections $g : \{1,2,3,4\} \to \{1,2,3,4\}$ where $g^{-1} = g \circ g$.

1-71. List all surjections $g : \{1,2,3,4\} \to \{a,b\}$.

1-72. List all injections $h : \{a,b,c\} \to \{1,2,3,4,5,6,7\}$ such that the image of $h$ does not
contain two consecutive integers. 

1-73. List all compositions of 4. 

1-74. List all compositions of 7 with exactly three parts. 

1-75. List all lattice paths from (0,0) to (4, 2). 

1-76. List all Dyck paths of order 4. 

1-77. List all three-element multisets using the alphabet {a, b, c}. 

1-78. List all subsets of the three-element set {{1, 2}, (3, 4), [1, 1]}.


1-79. Draw pictures of all compositions of 5. For each composition, determine the associated 
word in {0,1}* constructed in the proof of the Composition Rule. 


1-80. How many lattice paths start at (0,0) and end on the line $x + y = n$?

1-81. Let $r$ be the bijection in the proof of the Dyck Path Rule. Compute
$r$(NNEEEENNNNEEEENN) and $r^{-1}$(NENEENNEEENEEENN).


1-82. Draw all the non-Dyck lattice paths from (0,0) to (3,3) and compute their images 
under the reflection map r from the proof of the Dyck Path Rule. 


1-83. Ten lollipops are to be distributed to four children. All lollipops of the same color are 
considered identical. How many distributions are possible if: (a) all lollipops are red; (b) all 
lollipops have different colors; (c) there are four red and six blue lollipops? (d) What are 
the answers if each child must receive at least one lollipop? 

1-84. A monomial in $N$ variables is a term of the form $x_1^{d_1}x_2^{d_2}\cdots x_N^{d_N}$, where each $d_i \ge 0$.
The degree of this monomial is $d_1 + d_2 + \cdots + d_N$. How many monomials in $N$ variables
have degree (a) exactly $d$; (b) at most $d$?

1-85. How many multisets (of any size) can be formed from an n-letter alphabet if each 
letter can appear at most $k$ times in the multiset?

1-86. Find a bijection on Comp(n) that maps compositions with k parts to compositions 
with $n + 1 - k$ parts for all $k$.

1-87. (a) What is the probability that a random lattice path from (0,0) to (n,n) is a Dyck 
path? (b) What is the probability that a random Dyck path of order n only touches y = x 
at (0,0) and (n,n)? (c) What is the probability that a lattice path from (0,0) to (n,n) is a 
Dyck path, given that the path starts with a north step or ends with an east step? 

1-88. How many 7-element subsets of {1,2,...,20} contain no two consecutive integers? 


1-89. (a) How many points with positive integer coordinates lie on the plane $x + y + z = 8$?
(b) How many points with nonnegative integer coordinates lie on this plane? 


1-90. Count the integer solutions to $x_1 + x_2 + \cdots + x_5 = 23$ if we require $x_i \ge i$ for each $i$.

1-91. (a) For fixed $n$ and $k$, how many integer sequences $(x_1,x_2,\ldots,x_n)$ satisfy $x_1 + x_2 + \cdots +
x_n = k$ with all $x_i$ even and nonnegative? (b) How many sequences satisfy $x_1 + \cdots + x_n = k$
with all $x_i$ odd and positive?

1-92. How many solutions to 7] +2%2+--+-+ a7 =k are there if we require all 7; € Zo, £1 
is even, and x2 € {0,1}? 

1-93. How many words $w_1 \cdots w_k$ in the alphabet $\{1,2,\ldots,n\}$ satisfy $w_i \ge w_{i-1} + 2$ for
$i = 2,3,\ldots,k$?

1-94. (a) How many ways can we put $k$ unlabeled balls in $n$ labeled boxes if not all balls
need be used? (b) Repeat (a) assuming each box must contain at least one ball. (c) Repeat
(a) using labeled balls.

1-95. Let $A = \{1,2,3,4\}$, $B = \{u,v,w,x,y,z\}$, and define $f : A \to B$ by $f(1) = v$,
$f(2) = x$, $f(3) = z$, $f(4) = u$. How many functions $g : B \to A$ satisfy $g \circ f = \mathrm{id}_A$?

1-96. Counting Left Inverses. Given an injective function $f : X \to Y$ where $|X| = n$
and $|Y| = m$, count the functions $g : Y \to X$ such that $g \circ f = \mathrm{id}_X$.

1-97. Let A = {1,2,3,4,5,6,7}, B = {x,y,z}, and define f : A > B by f(1) = f(3) = 
f(4) = y, f(2) = f(5) = z, f(6) = f(7) = z. How many functions g : B > A satisfy 
$f \circ g = \mathrm{id}_B$?

1-98. Counting Right Inverses. Given a surjective function $f : X \to Y$ between finite
sets, how many functions $g : Y \to X$ satisfy $f \circ g = \mathrm{id}_Y$? (The answer depends on $f$, not
just on the sizes of $X$ and $Y$.)

1-99. Given a positive integer $n$, let the prime factorization of $n$ be $n = p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$, where
each $e_i > 0$ and the $p_i$ are distinct primes. How many positive divisors does $n$ have? How
many divisors does $n$ have in $\mathbb{Z}$?

1-100. Let the prime factorization of $n!$ be $p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$. Prove that $e_i = \sum_{j=1}^{\infty} \lfloor n/p_i^j \rfloor$.
(The notation $\lfloor x \rfloor$ denotes the greatest integer not exceeding the real number $x$.) Hence
determine the number of trailing zeroes in the decimal notation for 100!.


1-101. A password is a word using an alphabet containing 26 uppercase letters, 26 lower- 
case letters, 10 digits, and 30 special characters. (a) Compare the number of 5-character 
passwords with the number of 12-character passwords. (b) Find the least m for which there 
are more m-letter passwords using only uppercase letters than 7-letter passwords with no 
special characters. (c) How many 8-character passwords contain at least one digit and at 
least one special character? (d) How many 8-character passwords contain a letter, a digit, 
and a special character? (Use the Union Rule for Three Sets.) 


1-102. (a) How many 10-letter words in the alphabet {A, B,...,Z} are such that every Q 
in the word is immediately followed by U? [Hint: Create categories based on the number of 
Q’s in the word.] (b) What is the answer for n-letter words? 


1-103. Consider an alphabet with A consonants and B vowels. (a) How many words in this 
alphabet consist of m consonants and n vowels with no two consecutive vowels? (b) Repeat 
(a) if all letters in the word must be distinct. 


1-104. How many 15-digit numbers have the property that any two zero digits are separated 
by at least three nonzero digits? 


1-105. (a) How many integers between 1 and 1,000,000 contain the digit 7? (b) If we write 
the integers from 1 to 1,000,000, how often will we write the digit 7? (c) What are the 
answers to (a) and (b) if 7 is replaced by 0? 

1-106. For fixed $k, n, d \in \mathbb{Z}_{>0}$, how many $k$-element subsets $S$ of $\{1,2,\ldots,n\}$
are such that any two distinct elements of $S$ differ by at least $d$?


1-107. How many positive integers $x$ are such that $x$ and $x+3$ are both four-digit numbers
with no repeated digits?


1-108. Properties of Injections. Prove the following statements about injective functions.
(a) If $f : X \to Y$ and $g : Y \to Z$ are injective, then $g \circ f$ is injective. (b) If $g \circ f$
is injective, then $f$ is injective but $g$ may not be. (c) $f : X \to Y$ is injective iff for all
sets $W$ and all $g, h : W \to X$, $f \circ g = f \circ h$ implies $g = h$.

1-109. Properties of Surjections. Prove the following statements about surjective functions.
(a) If $f : X \to Y$ and $g : Y \to Z$ are surjective, then $g \circ f$ is surjective. (b) If $g \circ f$
is surjective, then $g$ is surjective but $f$ may not be. (c) $f : X \to Y$ is surjective iff for
all sets $Z$ and all $g, h : Y \to Z$, $g \circ f = h \circ f$ implies $g = h$.
1-110. Sorting by Comparisons. Consider a game in which player 1 picks a permutation
$w$ of $n$ letters, and player 2 must determine $w$ by asking player 1 a sequence of yes or no
questions. (Player 2 can choose later questions in the sequence based on the answers to
earlier questions.) Let $K(n)$ be the minimum number such that, no matter what $w$ player
1 chooses, player 2 can correctly identify $w$ after at most $K(n)$ questions. (a) Prove that
$(n/2)\log_2(n/2) \le \lceil \log_2(n!) \rceil \le K(n)$. (b) Prove that $K(n) = \lceil \log_2(n!) \rceil$
for $n \le 5$. (c) Prove that (b) still holds if we restrict player 2 to ask only questions of the
form “is $w_i < w_j$?” at each stage. (d) What can you conclude about the length of time needed
to sort $n$ distinct elements using an algorithm that makes decisions by comparing two data
elements at a time?

1-111. (a) You are given 12 seemingly identical coins and a balance scale. One coin is 
counterfeit and is either lighter or heavier than the others. Describe a strategy that can be 
used to identify which coin is fake in only three weighings. (b) If there are 13 coins, can the 
fake coin always be found in three weighings? Justify your answer. (c) If there are N coins 
(one of which is fake), derive a lower bound for the number of weighings required to find 
the fake coin. 


1-112. Suppose we randomly draw a 5-card hand from a 51-card deck where the queen of 
spades has been removed. Find the probability of the following hands: (a) four-of-a-kind; 
(b) three-of-a-kind; (c) full house; (d) straight; (e) flush; (f) two-pair; (g) one-pair. 

1-113. Suppose we mix together two 52-card decks and randomly draw a 5-card hand. Find 
the probability of the following hands: (a) four-of-a-kind; (b) three-of-a-kind; (c) full house; 
(d) straight; (e) flush; (f) two-pair; (g) one-pair; (h) five-of-a-kind, defined to be a hand 
where all five values are the same (e.g., five kings). 

1-114. Suppose we randomly draw a 5-card hand from a 48-card deck where the four 
jacks have been removed. Find the probability of the following hands: (a) four-of-a-kind; 
(b) three-of-a-kind; (c) full house; (d) straight; (e) flush; (f) two-pair; (g) one-pair; (h) a 
hand containing at least two diamonds. 

1-115. Suppose we randomly draw a 5-card hand from a 49-card deck where 89, 80, and 8& 
have been removed. Find the probability of the following hands: (a) four-of-a-kind; (b) three- 
of-a-kind; (c) full house; (d) straight; (e) flush; (f) two-pair; (g) one-pair; (h) three-of-a-kind, 
given that 8@ is in the hand. 

1-116. Suppose we randomly draw a 5-card hand from a 53-card deck containing a joker 
card. The joker can have any suit and any value to make the hand as good as possible. Find 
the probability of the following hands: (a) four-of-a-kind; (b) three-of-a-kind; (c) full house; 
(d) straight; (e) flush; (f) two-pair; (g) one-pair; (h) five-of-a-kind. 

1-117. Let A be the event that a five-card poker hand contains the ace of spades. (a) Find 
the conditional probability of the various poker hands given A. Which of these events are 
independent of A? (b) With minimum additional calculation, answer (a) again taking A to 
be the event that your hand contains the six of spades. 


1-118. Texas Hold ’em. In a popular version of poker, a player is dealt an ordered sequence
$(C_1, C_2, \ldots, C_7)$ of seven distinct cards from a 52-card deck. The last five cards in this
sequence are community cards shared with other players. In this exercise we concentrate on 
a single player, so we ignore this aspect of the game. The player uses these seven cards to 
form the best possible five-card poker hand. For example, if the seven-card sequence was 
(49, 7, 30, Ido, 5d, Ge, Qa), we would have a flush (the five club cards) since this beats 
the straight (3,4,5,6,7 of various suits). (a) Compute the size of the sample space. (b) What 
is the probability of getting four-of-a-kind? (c) What is the probability of getting a flush? 
(d) What is the probability of getting four-of-a-kind, given C; = 39 and C2 = 3@? (e) What 
is the probability of getting a flush, given Ci = 5% and C2 = 9? 
1-119. You have been dealt the poker hand {Ad&, AO, AV, 3@,79} from a 52-card deck.
You can now discard k cards and receive k new cards. Which is more likely: getting a full 
house or four-of-a-kind by discarding the 3, or getting a full house or four-of-a-kind by 
discarding the 3 and the 7? 


1-120. What is the probability that a random lattice path from (0,0) to (n,n) has exactly 
one north step below y = x? 

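1-121. Define $f : \mathbb{Z}_{\ge 0} \times \mathbb{Z}_{\ge 0} \to \mathbb{Z}_{>0}$ by $f(a,b) = 2^a(2b+1)$.
Prove that $f$ is a bijection.
1-122. Define $f : \mathbb{Z}_{\ge 0} \times \mathbb{Z}_{\ge 0} \to \mathbb{Z}_{\ge 0}$ by
$f(a,b) = ((a+b)^2 + 3a + b)/2$. Prove that $f$ is a bijection.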

1-123. Euler’s $\phi$ Function. For each $n \ge 1$, let $\Phi(n)$ be the set of integers $k$ between
1 and $n$ such that $\gcd(k,n) = 1$, and let $\phi(n) = |\Phi(n)|$. (a) Compute $\Phi(n)$ and $\phi(n)$
for $1 \le n \le 12$. (b) Compute $\phi(p)$ for $p$ prime. (c) Compute $\phi(p^e)$ for $p$ prime and
$e > 1$. (The next exercise shows how to compute $\phi(n)$ for any $n$.)

1-124. Chinese Remainder Theorem. In this exercise, we write “$a \bmod k$” to denote
the unique integer $b$ in the range $\{1,2,\ldots,k\}$ such that $k$ divides $(a - b)$. Suppose $m$ and
$n$ are fixed positive integers. Define a map

$f : \{1,2,\ldots,mn\} \to \{1,2,\ldots,m\} \times \{1,2,\ldots,n\}$ by setting $f(z) = (z \bmod m, z \bmod n)$.

(a) Show that $f(z) = f(w)$ iff $\mathrm{lcm}(m,n)$ divides $z - w$. (b) Show that $f$ is injective iff
$\gcd(m,n) = 1$. (c) Deduce that $f$ is a bijection iff $\gcd(m,n) = 1$. (d) Prove that for
$\gcd(m,n) = 1$, $f$ maps $\Phi(mn)$ bijectively onto $\Phi(m) \times \Phi(n)$, and hence
$\phi(mn) = \phi(m)\phi(n)$. (See the previous exercise for the definition of $\Phi$ and $\phi$.)
(e) Suppose $n$ has prime factorization $p_1^{e_1} \cdots p_s^{e_s}$. Prove that
$\phi(n) = n \prod_{i=1}^{s} (1 - 1/p_i)$.

1-125. The Bijective Product Rule. For any positive integers $m, n$, define

$g : \{0,1,\ldots,m-1\} \times \{0,1,\ldots,n-1\} \to \{0,1,\ldots,mn-1\}$

by setting $g(i,j) = ni + j$. Carefully prove that $g$ is a bijection.


1-126. Bijective Laws of Algebra. (a) For all sets $X, Y, Z$, prove that $X \cup Y = Y \cup X$,
$(X \cup Y) \cup Z = X \cup (Y \cup Z)$, and $X \cup \emptyset = X = \emptyset \cup X$. (b) For all sets
$X, Y, Z$, define bijections $f : X \times Y \to Y \times X$, $g : (X \times Y) \times Z \to X \times (Y \times Z)$,
and (for $Y, Z$ disjoint) $h : X \times (Y \cup Z) \to (X \times Y) \cup (X \times Z)$. (c) Use (a), (b), and
counting rules to deduce the algebraic laws $x + y = y + x$, $(x + y) + z = x + (y + z)$,
$x + 0 = x = 0 + x$, $xy = yx$, $(xy)z = x(yz)$, and $x(y + z) = xy + xz$, valid for all integers
$x, y, z \ge 0$.

1-127. The Bijective Laws of Exponents. Let $\mathrm{Fun}(A,B)$ denote the set of all functions
$f : A \to B$. (a) Given sets $X, Y, Z$ with $Y \cap Z = \emptyset$, define a bijection from
$\mathrm{Fun}(Y \cup Z, X)$ to $\mathrm{Fun}(Y, X) \times \mathrm{Fun}(Z, X)$. (b) If $X, Y, Z$ are any sets,
define a bijection from $\mathrm{Fun}(Z, \mathrm{Fun}(Y, X))$ to $\mathrm{Fun}(Y \times Z, X)$. (c) By
specializing to finite sets, deduce the laws of exponents $x^{y+z} = x^y x^z$ and $(x^y)^z = x^{yz}$
for all integers $x, y, z \ge 0$.

1-128. Let $X$ be any set (possibly infinite). Prove that every $f : X \to \mathcal{P}(X)$ is not
surjective. (Given $f$, show that $S = \{x \in X : x \notin f(x)\} \in \mathcal{P}(X)$ is not in the image
of $f$.) Conclude that $|X| \ne |\mathcal{P}(X)|$.

1-129. Let $X$ be the set of infinite sequences $w = (w_k : k \ge 0)$ with each $w_k \in \{0,1\}$. Show
there is no bijection $f : \mathbb{Z}_{\ge 0} \to X$. (Hint: See the previous exercise.)


1-130. A sample space S consists of 25 equally likely outcomes. Suppose we randomly 
choose an ordered pair (A,B) of events in S. (a) Find the probability that A and B are 
disjoint. (b) Find the probability that A and B are independent events. 
Notes


Some general references on combinatorics include [1, 9, 12, 15, 19, 20, 24, 53, 107, 110, 
121, 125, 128]. Detailed treatments of probability are given in texts such as [10, 27, 61, 88]. 
More information on set theory, including a discussion of cardinality for infinite sets, may 
be found in [59, 63, 90]. 
Combinatorial Identities and Recursions


Suppose we are proving an identity of the form a = b, where a and b are formulas that
may involve factorials, binomial coefficients, powers, summations, etc. One way to prove 
such an identity is to give an algebraic proof using tools like the Binomial Theorem, proof 
by induction, or other algebraic techniques. This chapter introduces another powerful and 
elegant method of proving identities called a combinatorial proof. A combinatorial proof 
establishes the equality of two formulas by exhibiting a set of objects whose cardinality is 
given by both formulas. There are three main steps in a combinatorial proof of a formula 
a = b. First, define an appropriate set S of objects; second, give a counting argument showing
that |S| = a; third, give a different counting argument showing that |S| = b. We illustrate
this technique by giving combinatorial proofs of the Binomial Theorem, the Multinomial 
Theorem, the Geometric Series Formula, and other binomial coefficient identities. 

The second part of the chapter studies combinatorial objects including set partitions, 
integer partitions, equivalence relations, surjections, ballot paths, pattern-avoiding permu- 
tations, and rook placements. To count these objects, we need the idea of combinatorial 
recursions. A recursion lets us compute a whole sequence of quantities by relating later 
values in the sequence to earlier values; the Fibonacci recursion $f_n = f_{n-1} + f_{n-2}$ is one
famous example. We will see that recursions derived from counting arguments allow us to 
enumerate complicated combinatorial collections whose cardinalities may not be given by a 
simple closed formula. We also present a method for automatically finding exact solutions 
to certain types of recursions, including the Fibonacci recursion. 




2.1 Initial Examples of Combinatorial Proofs 


In this section, we give initial illustrations of the combinatorial proof technique by proving 
four binomial coefficient identities. Each proof follows the outline discussed in the intro- 
duction to this chapter: first we introduce a set of objects; then we count this set in two 
different ways, leading to the two formulas on each side of the identity to be proved. Our 
first sample identity shows how to evaluate a sum of binomial coefficients. 


2.1. Theorem: Sums of Binomial Coefficients. For all $n \in \mathbb{Z}_{\ge 0}$,

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n.$$

Proof. Fix $n \in \mathbb{Z}_{\ge 0}$. Step 1. We define $S$ to be the set of all subsets of
$\{1,2,\ldots,n\}$. Step 2. By the Power Set Rule, we know that $|S| = 2^n$. (This rule motivated
our choice of $S$ in Step 1.) Step 3. Now we count $S$ in a different way. For $0 \le k \le n$, let
$S_k$ be the set of all $k$-element subsets of $\{1,2,\ldots,n\}$. On one hand, the Subset Rule tells
us that $|S_k| = \binom{n}{k}$ for each $k$. On the other hand, $S$ is the disjoint union of the sets
$S_0, \ldots, S_n$. So the Sum Rule tells us that $|S| = \sum_{k=0}^{n} |S_k| = \sum_{k=0}^{n} \binom{n}{k}$.
Comparing Steps 2 and 3, we obtain the required formula. □
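
The two counts in this proof can be checked by brute force for small $n$. The following Python sketch is an illustration only, not part of the original text; the function name `subsets_by_size` is our own. It lists the subsets of $\{1,\ldots,n\}$ by size and compares the totals with the Subset Rule and the Power Set Rule.

```python
from itertools import combinations
from math import comb

def subsets_by_size(n):
    """Return [|S_0|, ..., |S_n|], where S_k is the set of k-element subsets of {1,...,n}."""
    ground = range(1, n + 1)
    return [sum(1 for _ in combinations(ground, k)) for k in range(n + 1)]

for n in range(7):
    sizes = subsets_by_size(n)
    assert sizes == [comb(n, k) for k in range(n + 1)]  # Subset Rule
    assert sum(sizes) == 2 ** n                          # Sum Rule vs. Power Set Rule
```

Such a check is no substitute for the proof, since it covers only finitely many values of $n$.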


In our next example, we give an algebraic proof, a combinatorial proof, and a bijective 
proof of the identity in question. 


2.2. Theorem: Symmetry of Binomial Coefficients. For all integers $n, k$ with
$0 \le k \le n$, we have

$$\binom{n}{k} = \binom{n}{n-k}.$$


Algebraic Proof. To prove the identity algebraically, we give a chain of known equalities
leading from one side of the identity to the other. Specifically, by definition of binomial
coefficients, we have

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n!}{(n-k)!\,k!} = \frac{n!}{(n-k)!\,(n-(n-k))!} = \binom{n}{n-k}.$$


Combinatorial Proof. Step 1. We let $S = \mathcal{R}(0^k 1^{n-k})$ be the set of rearrangements of
$k$ zeroes and $n-k$ ones. Step 2. To build a word in $S$, choose a set of $k$ positions out of $n$
available positions to contain the zeroes, and fill the remaining positions with ones. By the
Subset Rule, we see that $|S| = \binom{n}{k}$. Step 3. Alternatively, we can build a word in $S$ by
choosing a set of $n-k$ positions out of $n$ positions to contain the ones, then filling the
remaining positions with zeroes. By the Subset Rule, we see that $|S| = \binom{n}{n-k}$.

Bijective Proof. Here is a variation of the basic combinatorial proof template where the
final step involves a bijection between two sets of objects. Let $S$ be the set of all $k$-element
subsets of $\{1,2,\ldots,n\}$, and let $T$ be the set of all $(n-k)$-element subsets of $\{1,2,\ldots,n\}$.
By the Subset Rule, we know $|S| = \binom{n}{k}$ and $|T| = \binom{n}{n-k}$. To prove
$\binom{n}{k} = \binom{n}{n-k}$, it therefore suffices to define a bijection $f : S \to T$. For $A \in S$,
define $f(A) = \{1,2,\ldots,n\} - A$. Since $A$ has size $k$, $f(A)$ has size $n-k$, so $f$ does map $S$
into the codomain $T$. Similarly, define $g : T \to S$ by $g(B) = \{1,2,\ldots,n\} - B$ for $B \in T$.
Since $B$ has size $n-k$, $g(B)$ has size $n-(n-k) = k$ and therefore is in the codomain $S$. One
sees immediately that $g \circ f = \mathrm{id}_S$ and $f \circ g = \mathrm{id}_T$, using the set-theoretic
fact that $X - (X - A) = A$ whenever $A \subseteq X$. Thus, $f$ and $g$ are bijections.
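
As a small illustration of the bijective proof, the following Python sketch (our own, with the hypothetical helper name `complement_map`) builds the complement map for one choice of $n$ and $k$ and checks that the two maps are mutually inverse.

```python
from itertools import combinations

def complement_map(n, k):
    """Map each k-subset A of {1,...,n} to its complement {1,...,n} - A."""
    universe = frozenset(range(1, n + 1))
    return {frozenset(A): universe - frozenset(A)
            for A in combinations(range(1, n + 1), k)}

n, k = 6, 2
f = complement_map(n, k)        # f : S -> T
g = complement_map(n, n - k)    # g : T -> S
assert all(g[f[A]] == A for A in f)   # g o f = id_S
assert all(f[g[B]] == B for B in g)   # f o g = id_T
assert len(f) == len(g) == 15         # C(6,2) = C(6,4) = 15
```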


Our next identity expresses the binomial coefficient $\binom{n}{k}$ as a sum of two other binomial
coefficients. This identity is connected to Pascal’s Triangle, which we discuss in §2.7.


2.3. Pascal’s Binomial Coefficient Identity. For all $n \in \mathbb{Z}_{>0}$ and all $k \in \mathbb{Z}$,

$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}.$$

Algebraic Proof. If $k < 0$ or $k > n$, both sides of the identity are zero by definition. If
$k = 0$, the left side is $\binom{n}{0} = 1$ and the right side is $\binom{n-1}{0} + 0 = 1$. If $k = n$,
the left side is $\binom{n}{n} = 1$ and the right side is $0 + \binom{n-1}{n-1} = 1$. For the rest of the
proof, assume $0 < k < n$. Expanding the right side of the identity, we find that

$$\binom{n-1}{k} + \binom{n-1}{k-1} = \frac{(n-1)!}{k!\,(n-1-k)!} + \frac{(n-1)!}{(k-1)!\,(n-k)!}.$$

To produce a common denominator, we multiply the first fraction by $(n-k)/(n-k)$ and
the second fraction by $k/k$, obtaining

$$\frac{(n-1)!\,(n-k)}{k!\,(n-k)!} + \frac{(n-1)!\,k}{k!\,(n-k)!} = \frac{(n-1)!\,(n-k+k)}{k!\,(n-k)!} = \frac{n!}{k!\,(n-k)!} = \binom{n}{k}.$$

We have now transformed the right side of the identity into the left side.


Combinatorial Proof. As in the algebraic proof, it suffices to consider the case $0 < k < n$.
Step 1. We define $S$ to be the set of all $k$-element subsets of $\{1,2,\ldots,n\}$. Step 2. By
the Subset Rule, we know that $|S| = \binom{n}{k}$. Step 3. To count $S$ in a new way, we write
$S = S_1 \cup S_2$, where $S_1$ is the collection of subsets in $S$ that do not contain $n$, and $S_2$ is
the collection of subsets in $S$ that do contain $n$. Symbolically, $S_1 = \{A \in S : n \notin A\}$ and
$S_2 = \{B \in S : n \in B\}$. Since $S_1$ and $S_2$ are disjoint, the Sum Rule says that
$|S| = |S_1| + |S_2|$. To count $S_1$, note that a $k$-element subset of $\{1,2,\ldots,n\}$ that does not
contain $n$ is the same as a $k$-element subset of $\{1,2,\ldots,n-1\}$. Thus, $|S_1| = \binom{n-1}{k}$ by
the Subset Rule. We can build each object in $S_2$ by picking an arbitrary subset of
$\{1,2,\ldots,n-1\}$ of size $k-1$ (in $\binom{n-1}{k-1}$ ways), then adding $n$ to this subset (in 1 way)
to get a $k$-element subset of $\{1,2,\ldots,n\}$ containing $n$. Thus, $|S_2| = \binom{n-1}{k-1}$, so
$|S| = \binom{n-1}{k} + \binom{n-1}{k-1}$.
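
The classification used in Step 3 can be carried out explicitly for small parameters. The Python sketch below is illustrative only (the function name `split_by_largest` is an assumption of ours); it splits the $k$-element subsets according to whether they contain $n$ and compares the sizes of the two classes with the two terms of the identity.

```python
from itertools import combinations
from math import comb

def split_by_largest(n, k):
    """Split the k-subsets of {1,...,n} into those avoiding n and those containing n."""
    subsets = list(combinations(range(1, n + 1), k))
    without_n = [A for A in subsets if n not in A]
    with_n = [A for A in subsets if n in A]
    return without_n, with_n

n, k = 7, 3
s1, s2 = split_by_largest(n, k)
assert len(s1) == comb(n - 1, k)        # subsets of {1,...,n-1} of size k
assert len(s2) == comb(n - 1, k - 1)    # subsets of {1,...,n-1} of size k-1, plus n
assert len(s1) + len(s2) == comb(n, k)  # Sum Rule
```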


2.4. Theorem: Sums of Squared Binomial Coefficients. For all $n \ge 0$,

$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.$$

Proof. We give a combinatorial proof of the equivalent identity
$\sum_{k=0}^{n} \binom{n}{k}\binom{n}{n-k} = \binom{2n}{n}$, in which we have replaced one copy of
$\binom{n}{k}$ by $\binom{n}{n-k}$ (using Theorem 2.2). Step 1. Define $S$ to be the set of all
$n$-element subsets of $X = \{1,2,\ldots,2n\}$. Step 2. By the Subset Rule, we know that
$|S| = \binom{2n}{n}$ (this fact motivated our choice of $S$ in Step 1). Step 3. We count $S$ in a new
way, based on the degree of overlap of a set in $S$ with the subsets $X_1 = \{1,2,\ldots,n\}$ and
$X_2 = \{n+1,\ldots,2n\}$. For $0 \le k \le n$, define

$$S_k = \{A \in S : |A \cap X_1| = k \text{ and } |A \cap X_2| = n-k\}.$$

$S$ is the disjoint union of the $S_k$’s, so that $|S| = \sum_{k=0}^{n} |S_k|$ by the Sum Rule. To
compute $|S_k|$, we build a typical object $A \in S_k$ by making two choices. First, choose the
$k$-element subset $A \cap X_1$ in any of $\binom{n}{k}$ ways. Second, choose the $(n-k)$-element subset
$A \cap X_2$ in any of $\binom{n}{n-k}$ ways. We see that $|S_k| = \binom{n}{k}\binom{n}{n-k}$ by the Product
Rule. Thus, $|S| = \sum_{k=0}^{n} \binom{n}{k}\binom{n}{n-k}$, completing the proof.
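
For a concrete instance of this argument, the following Python sketch (an illustration under our own naming, not the author's code) enumerates the $n$-element subsets of $\{1,\ldots,2n\}$ and groups them by the size of their overlap with $\{1,\ldots,n\}$.

```python
from itertools import combinations
from collections import Counter
from math import comb

def overlap_counts(n):
    """Count n-subsets A of {1,...,2n} by k = |A ∩ {1,...,n}|."""
    counts = Counter()
    for A in combinations(range(1, 2 * n + 1), n):
        k = sum(1 for x in A if x <= n)
        counts[k] += 1
    return counts

n = 4
counts = overlap_counts(n)
assert all(counts[k] == comb(n, k) * comb(n, n - k) for k in range(n + 1))  # Product Rule
assert sum(counts.values()) == comb(2 * n, n)                               # Sum Rule
```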


2.2 The Geometric Series Formula 


A geometric series is a sum of terms where each term arises from the preceding one by 
multiplying by a fixed ratio r. The following formulas show how to evaluate a finite or 
infinite geometric series. 


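2.5. The Geometric Series Formula. (a) For all $a \in \mathbb{R}$, $n \in \mathbb{Z}_{\ge 0}$, and
$r \in \mathbb{R} - \{1\}$,

$$\sum_{k=0}^{n} a r^k = \frac{a(1 - r^{n+1})}{1 - r}.$$

(b) For all $a \in \mathbb{R}$ and all $r \in (-1,1)$, $\sum_{k=0}^{\infty} a r^k = a/(1-r)$.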


Proof. To prove part (a) algebraically, fix $a$, $n$, and $r \ne 1$, and define

$$S = \sum_{k=0}^{n} a r^k = a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} + ar^n.$$

Multiplying both sides of this equation by $r$, we get
$rS = ar + ar^2 + \cdots + ar^{n-1} + ar^n + ar^{n+1}$. Subtracting the new equation from the old
one gives $(1-r)S = a - ar^{n+1} = a(1 - r^{n+1})$, since all the middle terms cancel. Dividing by
$1 - r$ gives the formula in part (a). For part (b), note that $-1 < r < 1$ implies
$\lim_{n \to \infty} r^n = 0$. Taking the limit of both sides of (a) as $n$ goes to infinity, we obtain
the formula in (b). □


When $r = 1$, we cannot divide by $1 - r$. In this case, however, it is immediate that
$\sum_{k=0}^{n} a r^k = \sum_{k=0}^{n} a = a(n+1)$, since we are summing up $n+1$ copies of $a$. The
Geometric Series Formula also holds for complex values of $a$ and $r$ (with the same proof), as
long as we restrict to $r \ne 1$ in (a) and $|r| < 1$ in (b).


2.6. Example. What rational number is represented by the infinite repeating decimal
$x = 4.1373737\cdots$? We have
$x = 4.1 + 37(10^{-3} + 10^{-5} + 10^{-7} + \cdots) = 4.1 + 0.037 \sum_{k=0}^{\infty} 10^{-2k}$.
Applying the Geometric Series Formula with $r = 10^{-2} = 1/100$, the sum here is
$1/(1 - 0.01) = 100/99$. Thus, $x = 41/10 + (37/1000)(100/99) = 41/10 + 37/990 = 2048/495$.
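
The arithmetic in this example can be reproduced with exact rational numbers. The short Python sketch below is ours (the helper `geometric_sum` is a hypothetical name); it applies part (b) of the Geometric Series Formula using the standard `fractions` module.

```python
from fractions import Fraction

def geometric_sum(a, r):
    """Exact value of a + a*r + a*r^2 + ... for |r| < 1, via a / (1 - r)."""
    a, r = Fraction(a), Fraction(r)
    if abs(r) >= 1:
        raise ValueError("the infinite series converges only for |r| < 1")
    return a / (1 - r)

# x = 4.1 + 0.037 * (1 + 10^-2 + 10^-4 + ...)
x = Fraction(41, 10) + geometric_sum(Fraction(37, 1000), Fraction(1, 100))
assert x == Fraction(2048, 495)
```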


Now we give a combinatorial proof of the finite Geometric Series Formula. Here we take
$a = 1$ and fix positive integers $r$ and $n$. Multiplying both sides of the identity by $-(1-r)$,
we are reduced to proving

$$(r-1)(1 + r + r^2 + \cdots + r^n) = r^{n+1} - 1.$$

Let $S$ be the set of words in the alphabet $\{0,1,\ldots,r-1\}$ of length $n+1$ excluding the
word $00\cdots0$. On one hand, the Word Rule shows that $|S| = r^{n+1} - 1$. On the other hand,
we can classify words in $S$ based on the position of the first nonzero symbol. For $0 \le j \le n$,
let $S_j$ be the set of words $w$ in $S$ where $w_1 = \cdots = w_j = 0$ and $w_{j+1} \ne 0$. To build a word
in $S_j$, put zeroes in the first $j$ positions (1 way), then choose a nonzero symbol $w_{j+1}$ ($r-1$
ways), then choose the remaining $n + 1 - (j+1) = n - j$ symbols arbitrarily ($r$ ways each).
By the Product Rule, $|S_j| = (r-1)r^{n-j}$. $S$ is the disjoint union of $S_0, \ldots, S_n$. So the Sum
Rule gives $|S| = \sum_{j=0}^{n} |S_j| = (r-1)r^n + (r-1)r^{n-1} + \cdots + (r-1)r^0$, completing the
proof. When $r = 10$, this proof is classifying positive integers between 1 and $10^{n+1} - 1$ based
on the number of digits in the decimal representation of the integer (excluding leading zeroes,
as usual).
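
The classification by the position of the first nonzero symbol can be tested directly on small cases. The following Python sketch is a hedged illustration (the name `first_nonzero_counts` is ours): it enumerates all nonzero words and tallies them by the number $j$ of leading zeroes.

```python
from itertools import product
from collections import Counter

def first_nonzero_counts(r, n):
    """Classify nonzero words of length n+1 over {0,...,r-1} by j = number of leading zeroes."""
    counts = Counter()
    for w in product(range(r), repeat=n + 1):
        if any(w):  # skip the all-zero word
            j = next(i for i, symbol in enumerate(w) if symbol != 0)
            counts[j] += 1
    return counts

r, n = 3, 4
counts = first_nonzero_counts(r, n)
assert all(counts[j] == (r - 1) * r ** (n - j) for j in range(n + 1))  # Product Rule
assert sum(counts.values()) == r ** (n + 1) - 1                        # Word Rule
```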


2.7. Remark. It may seem that the combinatorial proof yields a weaker result than the 
algebraic proof, since we had to assume that r was a positive integer (rather than a real 
number) in the combinatorial proof. But it turns out that if a polynomial identity of the 
form p(r) = q(r) holds for infinitely many values of r (say for all positive integers), then 
this identity must automatically hold for all real numbers r. To see why, rewrite the given 
identity as (p - q)(r) = 0, where p - q is a polynomial in the formal variable r. If p - q
were nonzero, say of degree d, then p — q would have at most d real roots. Since we are 
assuming the equation holds for infinitely many values of r, we conclude that p—q is the zero 
polynomial. Thus, p(r) = q(r) must hold for every r. A similar result holds for polynomial 
identities involving multiple parameters. 




2.3 The Binomial Theorem


The Binomial Theorem is a famous formula for expanding the nth power of a sum of two 
terms. Binomial coefficients are so named because of their appearance in this theorem. 
2.8. The Binomial Theorem. For all $x, y \in \mathbb{R}$ and all $n \in \mathbb{Z}_{\ge 0}$,

$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

For example,
$$(x+y)^4 = 1y^4 + 4xy^3 + 6x^2y^2 + 4x^3y + 1x^4.$$


We now give a combinatorial proof of the Binomial Theorem under the additional assumption
that $x$ and $y$ are positive integers. (Since both sides of the formula are polynomials in
$x$ and $y$, we can deduce the general case using the result mentioned in Remark 2.7.) Let
$A = \{V_1, \ldots, V_x, C_1, \ldots, C_y\}$ be an alphabet consisting of $x + y$ letters, where $A$ consists
of $x$ vowels $V_1, \ldots, V_x$ and $y$ consonants $C_1, \ldots, C_y$. Let $S$ be the set of all $n$-letter words
using the alphabet $A$. By the Word Rule, $|S| = (x+y)^n$. On the other hand, we can classify
words in $S$ based on how many vowels they contain. For $0 \le k \le n$, let $S_k$ be the
set of words in $S$ that contain exactly $k$ vowels (and hence $n-k$ consonants). To build
a word in $S_k$, first choose a set of $k$ positions out of $n$ where the vowels will appear
($\binom{n}{k}$ ways, by the Subset Rule); then fill these positions with a sequence of vowels ($x^k$
ways, by the Word Rule); then fill the remaining positions with a sequence of consonants
($y^{n-k}$ ways, by the Word Rule). By the Product Rule, $|S_k| = \binom{n}{k} x^k y^{n-k}$. By the Sum
Rule, $|S| = \sum_{k=0}^{n} |S_k| = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$.
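
To see the word-counting argument in action, here is a brief Python sketch (ours, for illustration; the tagging of letters as `("V", i)` and `("C", i)` is an arbitrary encoding we chose). It enumerates all $n$-letter words over $x$ vowels and $y$ consonants and tallies them by number of vowels.

```python
from itertools import product
from collections import Counter
from math import comb

def words_by_vowel_count(x, y, n):
    """Count n-letter words over x vowels and y consonants by number of vowels."""
    alphabet = [("V", i) for i in range(x)] + [("C", i) for i in range(y)]
    counts = Counter()
    for word in product(alphabet, repeat=n):
        k = sum(1 for kind, _ in word if kind == "V")
        counts[k] += 1
    return counts

x, y, n = 2, 3, 4
counts = words_by_vowel_count(x, y, n)
assert all(counts[k] == comb(n, k) * x ** k * y ** (n - k) for k in range(n + 1))
assert sum(counts.values()) == (x + y) ** n   # Word Rule
```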

We now give an algebraic proof of the Binomial Theorem to illustrate the method of proof
by induction. We hope the reader finds the combinatorial proof to be more illuminating and
elegant than the computations that follow. Fix $x, y \in \mathbb{R}$. The base case for the induction
proof occurs when $n = 0$. For this choice of $n$, the formula to be proved is
$(x+y)^0 = \binom{0}{0} x^0 y^{0-0}$. Using the convention that $r^0 = 1$ for all real $r$ (even $r = 0$)
and noting that $\binom{0}{0} = 0!/(0!\,0!) = 1$, we see that both sides of the formula to be proved
evaluate to 1.

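For the induction step, fix $n \in \mathbb{Z}_{\ge 0}$. We may assume that the Binomial Theorem is
already known for this fixed value of $n$, and we must then prove the corresponding formula
with $n$ replaced by $n+1$. Let us write the induction hypothesis in the form
$(x+y)^n = \sum_{k \in \mathbb{Z}} \binom{n}{k} x^k y^{n-k}$, where we now sum over all integers $k$ instead of
just $k = 0, 1, \ldots, n$. This is permissible since $\binom{n}{k} = 0$ for $k < 0$ or $k > n$. Using the
distributive law, we can now compute

$$(x+y)^{n+1} = (x+y)(x+y)^n = (x+y) \sum_{k \in \mathbb{Z}} \binom{n}{k} x^k y^{n-k}
= \sum_{k \in \mathbb{Z}} \binom{n}{k} x^{k+1} y^{n-k} + \sum_{k \in \mathbb{Z}} \binom{n}{k} x^k y^{n+1-k}.$$

In the first sum, replace the summation index $k$ by $k-1$. This is allowed, since $k-1$ ranges
over all integers as $k$ ranges over all integers. We get

$$(x+y)^{n+1} = \sum_{k \in \mathbb{Z}} \binom{n}{k-1} x^k y^{n+1-k} + \sum_{k \in \mathbb{Z}} \binom{n}{k} x^k y^{n+1-k}
= \sum_{k \in \mathbb{Z}} \left[ \binom{n}{k-1} + \binom{n}{k} \right] x^k y^{n+1-k}.$$

Finally, use Pascal’s Identity 2.3 (with $n$ replaced by $n+1$) to replace the sum in square
brackets by $\binom{n+1}{k}$. We have now shown
$(x+y)^{n+1} = \sum_{k \in \mathbb{Z}} \binom{n+1}{k} x^k y^{n+1-k}$, which completes the induction step
and the proof.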


2.9. Example. What is the coefficient of $z^7$ in $(2z-5)^9$? We apply the Binomial Theorem
taking $x = 2z$ and $y = -5$ and $n = 9$. We have

$$(2z-5)^9 = \sum_{k=0}^{9} \binom{9}{k} (2z)^k (-5)^{9-k}.$$

The only summand involving $z^7$ is the $k = 7$ summand. The corresponding coefficient is
$\binom{9}{7} 2^7 (-5)^2 = 115{,}200$.
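
The coefficient in this example can be cross-checked in two ways. In the Python sketch below (an illustration with function names of our own choosing), `binomial_coefficients` uses the Binomial Theorem, while `poly_power` multiplies out $(az + b)^n$ one factor at a time without using the theorem.

```python
from math import comb

def binomial_coefficients(a, b, n):
    """Coefficients c_0, ..., c_n of (a*z + b)**n, where c_k multiplies z**k."""
    return [comb(n, k) * a ** k * b ** (n - k) for k in range(n + 1)]

def poly_power(a, b, n):
    """Expand (a*z + b)**n by repeated multiplication; index k holds the z**k coefficient."""
    poly = [1]
    for _ in range(n):
        nxt = [0] * (len(poly) + 1)
        for i, c in enumerate(poly):
            nxt[i] += b * c       # multiply the term by b
            nxt[i + 1] += a * c   # multiply the term by a*z
        poly = nxt
    return poly

coeffs = binomial_coefficients(2, -5, 9)
assert coeffs == poly_power(2, -5, 9)
assert coeffs[7] == 115200
```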


2.10. Remark. If $r$ is any complex number and $z$ is a complex number such that $|z| < 1$,
there exists a power series expansion for $(1+z)^r$ called the Extended Binomial Theorem.
This power series is discussed in §5.3.


2.4 The Multinomial Theorem 


Just as binomial coefficients appear in the expansion of the nth power of a binomial, multi- 
nomial coefficients appear when we expand the nth power of a multinomial. 


2.11. The Multinomial Theorem. For all $s, n \in \mathbb{Z}_{\ge 0}$ and all $z_1, \ldots, z_s \in \mathbb{R}$,

$$(z_1 + z_2 + \cdots + z_s)^n = \sum_{n_1 + n_2 + \cdots + n_s = n} \binom{n}{n_1, n_2, \ldots, n_s} z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s}.$$

The summation here extends over all ordered sequences $(n_1, n_2, \ldots, n_s)$ of nonnegative
integers that sum to $n$.

For example,
$$(x+y+z)^3 = x^3 + y^3 + z^3 + 3x^2y + 3x^2z + 3y^2z + 3xy^2 + 3xz^2 + 3yz^2 + 6xyz.$$


There is a combinatorial proof of the Multinomial Theorem similar to our proof of the 
Binomial Theorem. The idea is to count n-letter words using an alphabet with s types of 
letters, where there are $z_1$ letters of type 1, $z_2$ letters of type 2, and so on. Alternatively,
the theorem can be proved algebraically using induction on n and the analogue of Pascal’s 
Identity for multinomial coefficients. Yet another algebraic proof uses the Binomial Theorem 
and induction on s. We ask the reader to supply details of these proofs in the exercises. 
In the rest of this section, we show how the Multinomial Theorem may be deduced from a 
more general result called the Non-Commutative Multinomial Theorem. 

Before stating this new result, let us consider an example. Suppose $A$, $B$, and $C$ are
$m \times m$ real matrices and we are trying to compute $(A+B+C)^n$. Recall that matrix
multiplication is not commutative in general, but matrices do obey other laws such as
associativity of addition and multiplication, commutativity of addition, and the distributive
law. Using these facts, we first compute

$$\begin{aligned}
(A+B+C)^2 &= (A+B+C)(A+B+C) \\
&= A(A+B+C) + B(A+B+C) + C(A+B+C) \\
&= AA + AB + AC + BA + BB + BC + CA + CB + CC.
\end{aligned}$$

Here we have written $AA$ instead of $A^2$, for reasons that will become clear in a moment.
Looking at the third power, we find

$$\begin{aligned}
(A+B+C)^3 &= (A+B+C)(A+B+C)^2 \\
&= (A+B+C)(AA + AB + AC + \cdots + CC) \\
&= A(AA + AB + AC + \cdots + CC) + B(AA + AB + AC + \cdots + CC) \\
&\qquad + C(AA + AB + AC + \cdots + CC) \\
&= AAA + AAB + AAC + \cdots + ACC + BAA + BAB + BAC + \cdots \\
&\qquad + BCC + CAA + CAB + CAC + \cdots + CCC.
\end{aligned}$$

The final formula is the sum of all three-letter words in the alphabet $\{A, B, C\}$. If we
continued on to $(A+B+C)^4 = (A+B+C)(A+B+C)^3$, the distributive law would take
the new factor $A+B+C$ and prepend each of the three letters $A$, $B$, $C$ to each of the
three-letter words formed at the previous stage. Thus, we see that $(A+B+C)^4$ is the sum
of all four-letter words in $\{A, B, C\}$. Generalizing this argument, we are led to the following
Non-Commutative Multinomial Theorem.


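2.12. The Non-Commutative Multinomial Theorem. For all $s, n \in \mathbb{Z}_{\ge 0}$ and all
$m \times m$ real matrices $Z_1, \ldots, Z_s$,

$$(Z_1 + Z_2 + \cdots + Z_s)^n = \sum_{w \in \{1,\ldots,s\}^n} Z_{w_1} Z_{w_2} \cdots Z_{w_n}.$$

Informally, the right side is the sum of all $n$-letter words in the alphabet $Z_1, \ldots, Z_s$.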


Proof. We use induction on $n$. When $n = 0$, both sides of the identity evaluate to $I_m$, the
$m \times m$ identity matrix (provided we interpret an empty product as $I_m$). If the reader wishes
to exclude the $n = 0$ case, the $n = 1$ case may also be verified directly by noting that both
sides evaluate to $Z_1 + \cdots + Z_s$. For the induction step, fix $n \ge 0$, assume the formula in the
theorem holds for $n$, and prove the formula holds for $n+1$. We compute

$$\begin{aligned}
(Z_1 + Z_2 + \cdots + Z_s)^{n+1} &= (Z_1 + Z_2 + \cdots + Z_s)^n (Z_1 + \cdots + Z_s) \\
&= \Bigl( \sum_{w \in \{1,\ldots,s\}^n} Z_{w_1} Z_{w_2} \cdots Z_{w_n} \Bigr) \Bigl( \sum_{j=1}^{s} Z_j \Bigr) \\
&= \sum_{w \in \{1,\ldots,s\}^n} \sum_{j=1}^{s} Z_{w_1} Z_{w_2} \cdots Z_{w_n} Z_j \\
&= \sum_{v \in \{1,\ldots,s\}^{n+1}} Z_{v_1} Z_{v_2} \cdots Z_{v_{n+1}}.
\end{aligned}$$

The first equality follows by definition of powers. The second equality holds by the induction
hypothesis. The third equality uses the distributive law for matrices. The fourth equality
arises by replacing the pair $(w, j)$ by the word $v = w_1 \cdots w_n j$. □
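
The theorem can be tested numerically on small non-commuting matrices. The following Python sketch is only an illustration: the helper functions and the particular $2 \times 2$ integer matrices are our own choices, and plain nested lists are used so that no external library is assumed.

```python
from itertools import product

def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)] for i in range(m)]

def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def mat_pow(A, n):
    """A**n by repeated multiplication; the empty product is the identity."""
    result = identity(len(A))
    for _ in range(n):
        result = mat_mul(result, A)
    return result

def sum_of_words(mats, n):
    """Sum of Z_{w_1} ... Z_{w_n} over all words w of length n in the indices of mats."""
    m = len(mats[0])
    total = [[0] * m for _ in range(m)]
    for w in product(range(len(mats)), repeat=n):
        term = identity(m)
        for index in w:
            term = mat_mul(term, mats[index])
        total = mat_add(total, term)
    return total

A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
assert mat_mul(A, B) != mat_mul(B, A)   # these matrices do not commute
assert mat_pow(mat_add(mat_add(A, B), C), 3) == sum_of_words([A, B, C], 3)
```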


We still need to show how the original Multinomial Theorem follows from the non-commutative
version. We begin with an example: given variables $w, x, y, z$ representing real
numbers, what is the coefficient of $w^3x^1y^2z^3$ in the expansion of $(w+x+y+z)^9$? Right
now, we know that this expansion is the sum of all nine-letter words using the alphabet
$\{w, x, y, z\}$. Words such as $wwwxyyzzz$, $wxyzwyzwz$, or $zzyywwwxz$ each contribute one
copy of the monomial $w^3x^1y^2z^3$ to the full expansion. In general, any word containing
exactly three $w$’s, one $x$, two $y$’s, and three $z$’s (in some order) gives one copy of the
monomial $w^3x^1y^2z^3$. Here we are using the fact that the real variables $w, x, y, z$ commute
with each other under multiplication. We see that the overall coefficient of $w^3x^1y^2z^3$ in the
expansion is $|\mathcal{R}(w^3x^1y^2z^3)|$, the number of rearrangements of $wwwxyyzzz$. By the Anagram
Rule, this coefficient is $\binom{9}{3,1,2,3} = 5040$.
The argument in this example extends to the general case. Given commuting variables
$z_1, \ldots, z_s$, consider what happens to the right side of the multinomial formula

$$(z_1 + \cdots + z_s)^n = \sum_{w \in \{1,\ldots,s\}^n} z_{w_1} z_{w_2} \cdots z_{w_n}$$

when we collect together all terms that yield a particular monomial
$z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s}$, where necessarily $n_1 + \cdots + n_s = n$. The number of times this
monomial occurs in the original sum is the number of rearrangements of $n_1$ ones, $n_2$ twos,
and so on. Thus, by the Anagram Rule, the coefficient of this monomial in the simplified sum
is $\binom{n}{n_1, n_2, \ldots, n_s}$. This completes the proof of the commutative version of the
Multinomial Theorem.
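
The passage from words to monomials can be illustrated computationally. In the Python sketch below (ours; the function names are hypothetical), each word is recorded by its exponent vector, and the number of words with a given vector is compared with the corresponding multinomial coefficient.

```python
from itertools import product
from collections import Counter
from math import factorial

def multinomial(*parts):
    """n! / (n_1! n_2! ... n_s!) with n = n_1 + ... + n_s."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result

def expansion_by_words(letters, n):
    """Group the n-letter words over `letters` by exponent vector (one entry per letter)."""
    counts = Counter()
    for word in product(letters, repeat=n):
        counts[tuple(word.count(x) for x in letters)] += 1
    return counts

assert multinomial(3, 1, 2, 3) == 5040          # coefficient of w^3 x y^2 z^3 in (w+x+y+z)^9
cube = expansion_by_words("xyz", 3)             # words contributing to (x+y+z)^3
assert cube[(1, 1, 1)] == 6 and cube[(2, 1, 0)] == 3 and cube[(3, 0, 0)] == 1
assert all(count == multinomial(*exps) for exps, count in cube.items())
```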


2.13. Remark. The Non-Commutative Multinomial Theorem is valid for elements
$Z_1, \ldots, Z_s$ in any ring (see the Appendix for the definition of a ring). The commutative
version of the formula holds whenever $Z_i Z_j = Z_j Z_i$ for all $i, j$. The Non-Commutative
Multinomial Theorem is a special case of the Generalized Distributive Law, discussed in
Exercise 2-16.




2.5 More Binomial Coefficient Identities 


Lattice paths can often be used to give elegant, visually appealing combinatorial proofs of 
identities involving binomial coefficients. We give several examples of such proofs in this 
section. Our first result is a version of Pascal’s Identity phrased in terms of multinomial 
coefficients. 


2.14. Lattice Path Version of Pascal’s Identity. For all $a, b \in \mathbb{Z}_{>0}$,

$$\binom{a+b}{a,b} = \binom{a+b-1}{a-1,b} + \binom{a+b-1}{a,b-1}.$$

Proof. Fix $a, b > 0$. Let $S$ be the set of lattice paths from $(0,0)$ to $(a,b)$. By the Lattice Path
Rule, $|S| = \binom{a+b}{a,b}$. On the other hand, we can write $S = S_1 \cup S_2$, where $S_1$ is the set of
paths in $S$ ending with an east step, and $S_2$ is the set of paths in $S$ ending with a north step.
By the Sum Rule, $|S| = |S_1| + |S_2|$. We can build a typical path in $S_1$ by choosing any lattice
path from $(0,0)$ to $(a-1,b)$ and then appending an east step; by the Lattice Path Rule,
$|S_1| = \binom{a+b-1}{a-1,b}$. We can build a typical path in $S_2$ by choosing any lattice path from
$(0,0)$ to $(a,b-1)$ and then appending a north step; by the Lattice Path Rule,
$|S_2| = \binom{a+b-1}{a,b-1}$. Thus, $|S| = \binom{a+b-1}{a-1,b} + \binom{a+b-1}{a,b-1}$, as needed. We
remark that the identity also holds if one (but not both) of $a$ or $b$ is zero, since both sides of
the identity are 1 in that case. □
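
Classifying paths by their last step also gives a simple way to compute path counts recursively. The Python sketch below is our own illustration (the function name `lattice_paths` is an assumption); it counts paths using only the last-step recursion and compares the results with the binomial coefficient from the Lattice Path Rule.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def lattice_paths(a, b):
    """Number of paths from (0,0) to (a,b) using unit east and north steps."""
    if a < 0 or b < 0:
        return 0
    if a == 0 and b == 0:
        return 1  # the empty path
    # The last step is either east (arriving from (a-1,b)) or north (from (a,b-1)).
    return lattice_paths(a - 1, b) + lattice_paths(a, b - 1)

assert all(lattice_paths(a, b) == comb(a + b, a) for a in range(8) for b in range(8))
```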


The key idea of the preceding proof was to classify a lattice path based on the last step 
in the path. We can prove more elaborate binomial coefficient identities by classifying lattice 
paths in more clever ways. For instance, let us reprove the identity for the sum of squares 
of binomial coefficients using lattice paths. 
2.15. Lattice Path Proof of $\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}$. Let S be the set of all lattice paths from
(0,0) to (n,n). By the Lattice Path Rule, $|S| = \binom{2n}{n,n} = \binom{2n}{n}$. For $0 \le k \le n$, let $S_k$ be
the set of all paths in S passing through the point (k, n − k) on the line x + y = n. Every
path in S must go through exactly one such point for some k between 0 and n, so S is the
disjoint union of $S_0, S_1, \ldots, S_n$. See Figure 2.1. To build a path in $S_k$, first choose a path
from (0,0) to (k, n − k) in any of $\binom{n}{k,n-k} = \binom{n}{k}$ ways. Second, choose a path from (k, n − k)
to (n,n). This is a path in a rectangle of width n − k and height n − (n − k) = k, so there
are $\binom{n}{n-k,k} = \binom{n}{k}$ ways to make this second choice. By the Sum Rule and Product Rule, we
conclude that

\[ |S| = \sum_{k=0}^{n} |S_k| = \sum_{k=0}^{n} \binom{n}{k}^2. \]


[Figure: a lattice path from (0,0) to (n,n).]

FIGURE 2.1
A combinatorial proof using lattice paths.


The next identity arises by classifying lattice paths based on the number of east steps 
following the final north step in the path. 


2.16. Theorem. For all integers $a \ge 0$ and $b \ge 1$,

\[ \binom{a+b}{a,b} = \sum_{k=0}^{a} \binom{k+b-1}{k,b-1}. \]

Proof. Let S be the set of all lattice paths from the origin to (a,b). We know that $|S| = \binom{a+b}{a,b}$.
For $0 \le k \le a$, let $S_k$ be the set of paths in S such that the last north step of the path
lies on the line x = k. See Figure 2.2. We can build a path in $S_k$ by choosing any lattice
path from the origin to (k, b − 1) in $\binom{k+b-1}{k,b-1}$ ways, and then appending one north step and
a − k east steps. Thus, the required identity follows from the Sum Rule. If we classify the
paths by the final east step instead, we obtain the dual identity

\[ \binom{a+b}{a,b} = \sum_{j=0}^{b} \binom{a-1+j}{a-1,j}, \]

valid for all $a \ge 1$ and $b \ge 0$. This identity also follows from the previous one by the known
symmetry $\binom{a+b}{a,b} = \binom{b+a}{b,a}$. □


We can obtain a more general identity by classifying lattice paths based on the location 
of a particular north step (not necessarily the last one) in the path. 




[Figure: a lattice path from (0,0) to (a,b).]

FIGURE 2.2
Another combinatorial proof using lattice paths.


2.17. The Chu–Vandermonde Identity. For all integers $a, b, c \ge 0$,

\[ \binom{a+b+c+1}{a,\,b+c+1} = \sum_{k=0}^{a} \binom{k+b}{k,b} \binom{a-k+c}{a-k,c}. \]

Proof. Let S be the set of all lattice paths from the origin to (a, b + c + 1). By the Lattice
Path Rule, $|S| = \binom{a+b+c+1}{a,\,b+c+1}$. For $0 \le k \le a$, let $S_k$ be the set of paths in S that contain the
north step from (k, b) to (k, b + 1). Since every path in S must cross the line y = b + 1/2
by taking a north step between the lines x = 0 and x = a, we see that S is the disjoint
union of $S_0, S_1, \ldots, S_a$. See Figure 2.3. Now, we can build a path in $S_k$ as follows. First,
choose a lattice path from the origin to (k, b) in $\binom{k+b}{k,b}$ ways. Second, append a north step
to this path. Third, choose a lattice path from (k, b + 1) to (a, b + c + 1). This is a path in a
rectangle of width a − k and height c, so there are $\binom{a-k+c}{a-k,c}$ ways to make this choice. Thus,
$|S_k| = \binom{k+b}{k,b} \cdot 1 \cdot \binom{a-k+c}{a-k,c}$ by the Product Rule. The identity in the theorem now follows from
the Sum Rule. □


[Figure: a lattice path from (0,0) to (a, b+c+1).]

FIGURE 2.3
A third combinatorial proof using lattice paths.
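The lattice path identities in this section can be spot-checked numerically. The following is a minimal sketch, assuming Python 3.8+ (for `math.comb`); the helper names `paths` and `check_identities` are illustrative and do not come from the text. It tests Identities 2.14 through 2.17 for small parameter values, which is a sanity check and not a proof.

```python
from math import comb

def paths(a, b):
    """Number of lattice paths from (0,0) to (a,b), by the Lattice Path Rule."""
    return comb(a + b, a)

def check_identities(limit=6):
    """Check Identities 2.14-2.17 for all parameters below `limit`."""
    for a in range(limit):
        for b in range(limit):
            # 2.14 (Pascal's Identity), stated for a, b > 0.
            if a > 0 and b > 0:
                assert paths(a, b) == paths(a - 1, b) + paths(a, b - 1)
            # 2.16: classify paths by the line x = k holding the last north step.
            if b >= 1:
                assert paths(a, b) == sum(paths(k, b - 1) for k in range(a + 1))
            # 2.17 (Chu-Vandermonde): classify by the north step from (k,b) to (k,b+1).
            for c in range(limit):
                assert paths(a, b + c + 1) == sum(
                    paths(k, b) * paths(a - k, c) for k in range(a + 1)
                )
    # 2.15: sum of squares of binomial coefficients.
    for n in range(limit):
        assert sum(comb(n, k) ** 2 for k in range(n + 1)) == comb(2 * n, n)
    return True
```

Calling `check_identities()` raises an `AssertionError` if any identity fails in the tested range, and returns `True` otherwise.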
2.6 Sums of Powers of Integers


One often needs to evaluate sums of the form $1^p + 2^p + \cdots + n^p$, where p is a fixed integer.
The next result gives formulas for these sums when p is 1, 2, or 3.


2.18. Theorem: Sums of Powers. For all $n \in \mathbb{Z}_{>0}$,

\[ \sum_{k=1}^{n} k = \frac{n(n+1)}{2}, \qquad \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}, \qquad \sum_{k=1}^{n} k^3 = \frac{n^2(n+1)^2}{4}. \]


Each summation formula may be proved by induction on n (Exercise 2-26). A combinatorial
proof of the sum-of-cubes formula is indicated in Exercise 2-27. Here we give an
algebraic proof of these identities based on Theorem 2.16. Taking b = p + 1 and a = n − p
in that theorem, we see that

\[ \binom{p}{p} + \binom{p+1}{p} + \cdots + \binom{n}{p} = \sum_{k=p}^{n} \binom{k}{p} = \binom{n+1}{p+1}. \tag{2.2} \]

Letting p = 1 immediately yields $1 + 2 + 3 + \cdots + n = \binom{n+1}{2}$. Letting p = 2, we get
$\sum_{k=2}^{n} \binom{k}{2} = \binom{n+1}{3}$. Since $\binom{k}{2} = (k^2 - k)/2$ for all k in the indicated range, we have

\[ \frac{1}{2} \sum_{k=1}^{n} k^2 - \frac{1}{2} \sum_{k=1}^{n} k = \frac{(n+1)n(n-1)}{6}. \]

Using the previous formula for $\sum_{k=1}^{n} k$ and solving for the sum of squares, we obtain

\[ \sum_{k=1}^{n} k^2 = \frac{n(n+1)}{2} + \frac{2n(n+1)(n-1)}{6} = \frac{n(n+1)(2n+1)}{6}. \]

Now letting p = 3 in (2.2), we get $\sum_{k=3}^{n} \binom{k}{3} = \binom{n+1}{4}$. Since $\binom{k}{3} = k(k-1)(k-2)/6 = (k^3 - 3k^2 + 2k)/6$,
this becomes

\[ \frac{1}{6} \sum_{k=1}^{n} k^3 - \frac{1}{2} \sum_{k=1}^{n} k^2 + \frac{1}{3} \sum_{k=1}^{n} k = \frac{(n+1)n(n-1)(n-2)}{24}. \]


Inserting the previous formulas and doing some algebra, we eventually arrive at the formula 
for the sum of cubes. This method can be continued to evaluate the sum of fourth powers, 
fifth powers, and so on. The advantage of this technique compared to induction is that one 
does not need to know the final answer in advance. 
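For readers who want the omitted algebra for the sum of cubes, here is one way to carry it out. It starts from the p = 3 case of (2.2), namely $\sum_{k=3}^{n} \binom{k}{3} = \binom{n+1}{4}$ with $\binom{k}{3} = (k^3 - 3k^2 + 2k)/6$, and substitutes the formulas already obtained for $\sum k$ and $\sum k^2$. The grouping of terms is our own choice; other routes work equally well.

```latex
\begin{align*}
\sum_{k=1}^{n} k^3
  &= 3\sum_{k=1}^{n} k^2 - 2\sum_{k=1}^{n} k + 6\binom{n+1}{4} \\
  &= \frac{n(n+1)(2n+1)}{2} - n(n+1) + \frac{n(n+1)(n-1)(n-2)}{4} \\
  &= \frac{n(n+1)}{4}\Bigl[\,2(2n+1) - 4 + (n-1)(n-2)\Bigr] \\
  &= \frac{n(n+1)}{4}\,\bigl(n^2 + n\bigr)
   = \frac{n^2(n+1)^2}{4}.
\end{align*}
```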


2.7 Recursions 


Suppose we are given some unknown quantities $a_0, a_1, \ldots, a_n, \ldots$. A closed formula for
these quantities is an expression of the form $a_n = f(n)$, where the right side is some explicit
formula involving the integer n but not involving any of the unknown quantities $a_i$. In
contrast, a recursive formula for $a_n$ is an expression of the form $a_n = f(n, a_0, a_1, \ldots, a_{n-1})$,
where the right side is a formula that does involve one or more of the unknown quantities
$a_i$. A recursive formula is usually accompanied by one or more initial conditions, which are
non-recursive expressions for $a_0$ and possibly other $a_i$’s. Similar definitions apply to doubly
indexed sequences $a_{n,k}$.

Now consider the problem of counting sets of combinatorial objects. Suppose we have
several related families of objects, say $T_0, T_1, \ldots, T_n, \ldots$. We think of the index n as somehow
measuring the size of the objects in $T_n$. Sometimes we can find a counting argument leading
to a closed formula for $|T_n|$. In many cases, however, it is more natural to give a recursive
description of $T_n$, which tells us how to construct a typical object in $T_n$ by assembling
smaller objects of the same kind from the sets $T_0, \ldots, T_{n-1}$. Such an argument leads to a
recursive formula for $|T_n|$ in terms of one or more of the quantities $|T_0|, \ldots, |T_{n-1}|$. If we
suspect that $|T_n|$ is also given by a certain closed formula, we can then prove this fact using
induction. The following examples illustrate these ideas.


2.19. Example: Subset Recursion. For each integer $n \ge 0$, let $T_n$ be the set of all
subsets of {1,2,...,n}, and let $a_n = |T_n|$. We derive a recursive formula for $a_n$ as follows.
Suppose $n \ge 1$ and we are trying to build a typical subset $A \in T_n$. We can do this recursively
by first choosing a subset $A' \subseteq \{1,2,\ldots,n-1\}$ in any of $|T_{n-1}| = a_{n-1}$ ways, and then
either adding or not adding the element n to this subset (two possibilities). By the Product
Rule, we conclude that

\[ a_n = a_{n-1} \cdot 2 \quad \text{for all } n \ge 1. \]

The initial condition is $a_0 = 1$, since $T_0 = \{\emptyset\}$.
Using the recursion and initial condition, we calculate:

\[ (a_0, a_1, a_2, a_3, a_4, a_5, \ldots) = (1, 2, 4, 8, 16, 32, \ldots). \]

The pattern suggests that $a_n = 2^n$ for all $n \ge 0$. (This already follows from the Power Set
Rule, but our goal is to reprove this fact using our recursion.) We prove that $a_n = 2^n$ by
induction on n. In the base case (n = 0), we have $a_0 = 1 = 2^0$ by the initial condition.
Assume that n > 0 and that $a_{n-1} = 2^{n-1}$ (this is the induction hypothesis). Using the
recursion and the induction hypothesis, we see that

\[ a_n = 2a_{n-1} = 2(2^{n-1}) = 2^n. \]

This completes the proof by induction.


2.20. Example: Fibonacci Words. Let $W_n$ be the set of all words in $\{0,1\}^n$ that
do not have two consecutive zeroes, and let $f_n = |W_n|$. We now derive a recursion and
initial conditions for the sequence of $f_n$’s. To obtain initial conditions, note that $W_0 = \{\epsilon\}$
(where $\epsilon$ denotes the empty word), so $f_0 = 1$. Also $W_1 = \{0, 1\}$, so $f_1 = 2$. To obtain a
recursion, fix $n \in \mathbb{Z}_{\ge 2}$. We use the Sum Rule to find a formula for $|W_n| = f_n$. The idea
is to classify words in $W_n$ based on their first symbol. Let $W_n' = \{w \in W_n : w_1 = 1\}$ and
$W_n'' = \{w \in W_n : w_1 = 0\}$. To build a word in $W_n'$, write a 1 followed by an arbitrary
word of length n − 1 with no consecutive zeroes. There are $|W_{n-1}| = f_{n-1}$ words of the
latter type, so $|W_n'| = f_{n-1}$. On the other hand, to build a word in $W_n''$, first write a 0.
Since consecutive zeroes are not allowed, the next symbol of the word must be 1. We can
complete the word by choosing an arbitrary word of length n − 2 with no consecutive zeroes.
This choice can be made in $f_{n-2}$ ways, so $|W_n''| = f_{n-2}$. The Sum Rule now shows that
$|W_n| = |W_n'| + |W_n''|$, so

\[ f_n = f_{n-1} + f_{n-2} \quad \text{for all } n \ge 2. \]

Using this recursion and the initial conditions, we compute

\[ (f_0, f_1, f_2, f_3, f_4, f_5, \ldots) = (1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, \ldots). \]

The numbers $f_n$ are called Fibonacci numbers. We will find an explicit closed formula for
$f_n$ in §2.16.
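The recursion for Fibonacci words can be compared against a direct enumeration of the words themselves. The sketch below assumes words are represented as Python strings over the characters `0` and `1`; the function names are illustrative and not taken from the text.

```python
from itertools import product

def fib_words_count(n_max):
    """Return [f_0, ..., f_{n_max}] via f_n = f_{n-1} + f_{n-2}, with f_0 = 1, f_1 = 2."""
    f = [1, 2]
    for n in range(2, n_max + 1):
        f.append(f[n - 1] + f[n - 2])
    return f[: n_max + 1]

def brute_force_count(n):
    """Count words in {0,1}^n with no two consecutive zeroes by listing all 2^n words."""
    return sum(1 for w in product("01", repeat=n) if "00" not in "".join(w))
```

The recursion needs about n additions, while the brute-force count examines all 2^n words, so the second function is only practical as a check for small n.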


The next example involves a doubly indexed family of combinatorial objects. Our goal 
is to reprove the Subset Rule by developing a recursion for counting k-element subsets of
{1,2,...,n}.


2.21. Example: Recursion for k-element Subsets. For all integers $n, k \ge 0$, let C(n,k)
be the number of k-element subsets of {1,2,...,n}. We will find a recursion and initial
conditions for the numbers C(n,k), then use this information to prove the earlier formula
$C(n,k) = \binom{n}{k} = \frac{n!}{k!(n-k)!}$ by induction. The recursion is provided by Pascal’s Identity 2.3.
In the current notation, this identity says that

\[ C(n,k) = C(n-1,k) + C(n-1,k-1) \quad \text{for } 0 < k < n. \]

Recall that the first term on the right counts k-element subsets of {1,2,...,n} that do not
contain n, whereas the second term counts those subsets that do contain n. Note that this
reasoning establishes the recursion for C(n,k) without advance knowledge of the Subset
Rule. To get initial conditions, note that C(n,k) = 0 whenever k > n; C(n,0) = 1 since the
empty set is the unique zero-element subset of any set; and C(n,n) = 1 since {1,2,...,n}
is the only n-element subset of itself.

Now we use the recursion and initial conditions to prove $C(n,k) = \frac{n!}{k!(n-k)!}$ for $0 \le k \le n$.
We proceed by induction on n. In the base case, n = k = 0, and we know $C(0,0) = 1 = \frac{0!}{0!(0-0)!}$.
Next, fix n > 0 and assume that for all j in the range $0 \le j \le n-1$,

\[ C(n-1,j) = \frac{(n-1)!}{j!\,(n-1-j)!}. \]

Fix k with $0 \le k \le n$. If k = 0, the initial condition gives C(n,k) = 1 = n!/(0!(n − 0)!)
as needed. Similarly, the result holds when k = n. If 0 < k < n, we use the recursion and
induction hypothesis (applied to j = k − 1 and to j = k, which are integers between 0 and
n − 1) to compute

\begin{align*}
C(n,k) &= C(n-1,k-1) + C(n-1,k) \\
       &= \frac{(n-1)!}{(k-1)!\,((n-1)-(k-1))!} + \frac{(n-1)!}{k!\,((n-1)-k)!} \\
       &= \frac{(n-1)!\,k}{k!\,(n-k)!} + \frac{(n-1)!\,(n-k)}{k!\,(n-k)!} \\
       &= \frac{(n-1)!}{k!\,(n-k)!} \cdot [k + (n-k)] = \frac{n!}{k!\,(n-k)!}.
\end{align*}

This completes the proof by induction. (Note that this calculation recreates the algebraic
proof of Pascal’s Identity.)


The reader may wonder what good it is to have a recursion for C(n, k), since we already 
proved by other methods the explicit formula $C(n,k) = \frac{n!}{k!(n-k)!}$. There are several answers
to this question. One answer is that the recursion for the C(n, k)’s gives us a fast method 
for calculating these quantities that is more efficient than computing with factorials. One 
popular way of displaying this calculation is called Pascal’s Triangle. We build this triangle 
by writing the n + 1 numbers C(n,0),C(n,1),...,C(n,n) in the nth row from the top. If 
we position the entries as shown in Figure 2.4, then each entry is the sum of the two entries 
directly above it. We compute C(n,k) by calculating rows 0 through n of this triangle. 
n=0:                          1
n=1:                        1   1
n=2:                      1   2   1
n=3:                    1   3   3   1
n=4:                  1   4   6   4   1
n=5:                1   5  10  10   5   1
n=6:              1   6  15  20  15   6   1
n=7:            1   7  21  35  35  21   7   1
n=8:          1   8  28  56  70  56  28   8   1

FIGURE 2.4
Pascal’s Triangle.


Note that computing C(n,k) via Pascal’s Recursion requires only addition operations. In
contrast, calculation using the closed formula $\frac{n!}{k!(n-k)!}$ requires us to divide one large factorial
by the product of two other factorials (although this work can be reduced somewhat by
cancellation of common factors). For example, Pascal’s Triangle quickly yields C(8,4) = 70,
while the closed formula gives $\binom{8}{4} = \frac{8!}{4!\,4!} = \frac{40320}{576} = 70$.
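The row-by-row construction of Pascal’s Triangle translates directly into code that uses only additions. This is a minimal sketch; the function name `pascal_rows` and the list-of-lists representation are our own choices for illustration.

```python
def pascal_rows(n_max):
    """Return rows 0..n_max of Pascal's Triangle, computed with additions only."""
    rows = [[1]]  # row 0: C(0,0) = 1
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # Initial conditions C(n,0) = C(n,n) = 1; interior entries come from
        # Pascal's recursion C(n,k) = C(n-1,k-1) + C(n-1,k) for 0 < k < n.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows
```

For instance, `pascal_rows(8)[8][4]` reads off C(8,4) from the last row, matching the value 70 quoted above.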

In bijective combinatorics, it turns out that the arithmetic operation of division is much 
harder to understand from a combinatorial standpoint than the operations of addition and 
multiplication. In particular, our original derivation of the formula $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ was an


indirect argument using the Product Rule, in which we divided by k! at the end (see §1.4). 
For some applications (such as listing all k-element subsets of a given n-element set, dis- 
cussed in Chapter 6), it is necessary to have a counting argument that does not rely on 
division. 

A final reason for studying recursions for C(n,k) is to emphasize that recursions are 
helpful and ubiquitous tools for analyzing combinatorial objects. Indeed, we will soon be 
considering combinatorial collections whose cardinalities may not be given by explicit closed 
formulas. Nevertheless, these cardinalities satisfy recursions that allow them to be computed 
quickly and efficiently. 




2.8 Recursions for Multisets and Anagrams 


This section develops recursions for counting multisets and anagrams. 


2.22. Recursion for Multisets. In §1.12, we counted k-element multisets from an n- 
letter alphabet using bijective techniques. Now, we give a recursive analysis to reprove the 
enumeration results for multisets. For all integers $n, k \ge 0$, let M(n,k) be the number of
k-element multisets using letters from {1,2,...,n}. The initial conditions are M(n,0) = 1
for all $n \ge 0$ and M(0,k) = 0 for all k > 0. We now derive a recursion for M(n,k) valid for
n > 0 and k > 0. A typical multiset counted by M(n,k) either does not contain n at all or
contains one or more copies of n. In the former case, the multiset is a k-element multiset 
using letters from {1,2,...,n— 1}, and there are M(n — 1,k) such multisets. In the latter 
case, if we remove one copy of n from the multiset, we obtain an arbitrary (k — 1)-element 
multiset using letters from {1,2,...,n}. There are M(n,k —1) such multisets. By the Sum 




        k=0   k=1   k=2   k=3   k=4   k=5   k=6
n=0:      1     0     0     0     0     0     0
n=1:      1     1     1     1     1     1     1
n=2:      1     2     3     4     5     6     7
n=3:      1     3     6    10    15    21    28
n=4:      1     4    10    20    35    56    84
n=5:      1     5    15    35    70   126   210
n=6:      1     6    21    56   126   252   462

FIGURE 2.5
Table for computing M(n,k) recursively.


Rule, we obtain the recursion

\[ M(n,k) = M(n-1,k) + M(n,k-1) \quad \text{for } n > 0 \text{ and } k > 0. \]

It can now be proved by induction that for all n > 0 and all k > 0, $M(n,k) = \binom{k+n-1}{k,\,n-1}$.
The proof is similar to the corresponding proof for C(n,k), so we omit it. 

We can use the recursion to compute values of M(n, k). Here we use a left-justified table 
of entries in which the nth row contains the numbers M(n,0),M(n,1),.... The values in 
the top row (where n = 0) and in the left column (where k = 0) are given by the initial 
conditions. Each remaining entry in the table is the sum of the number directly above it 
and the number directly to its left. See Figure 2.5. This table is a tilted version of Pascal’s 
Triangle. 
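The table-filling procedure just described can be sketched as follows. The function name `multiset_table` and the nested-list layout (row index n, column index k) are assumptions made for this illustration.

```python
def multiset_table(n_max, k_max):
    """Return M with M[n][k] = number of k-element multisets from an n-letter alphabet."""
    M = [[0] * (k_max + 1) for _ in range(n_max + 1)]
    for n in range(n_max + 1):
        M[n][0] = 1  # initial condition M(n,0) = 1
    # The initial condition M(0,k) = 0 for k > 0 is already in place.
    for n in range(1, n_max + 1):
        for k in range(1, k_max + 1):
            # Entry = number directly above + number directly to the left.
            M[n][k] = M[n - 1][k] + M[n][k - 1]
    return M
```

Filling the table row by row guarantees that both entries on the right side of the recursion are available when each new entry is computed.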


2.23. Recursion for Multinomial Coefficients. Let $n_1, \ldots, n_s$ be nonnegative integers
that add to n. Let $\{a_1, \ldots, a_s\}$ be a given s-letter alphabet, and let $C(n; n_1, \ldots, n_s) =
|R(a_1^{n_1} \cdots a_s^{n_s})|$ be the number of n-letter words that are rearrangements of $n_i$ copies of $a_i$
for $1 \le i \le s$. We proved in §1.5 that $C(n; n_1, \ldots, n_s) = \binom{n}{n_1, \ldots, n_s} = \frac{n!}{n_1!\, n_2! \cdots n_s!}$. We now
give a new proof of this result using recursions.

Assume first that every $n_i$ is positive. For $1 \le i \le s$, let $T_i$ be the set of words in
$T = R(a_1^{n_1} \cdots a_s^{n_s})$ that begin with the letter $a_i$. T is the disjoint union of the sets $T_1, \ldots, T_s$.
To build a typical word $w \in T_i$, we start with the letter $a_i$ and then append any element of
$R(a_1^{n_1} \cdots a_i^{n_i - 1} \cdots a_s^{n_s})$. There are $C(n-1; n_1, \ldots, n_i - 1, \ldots, n_s)$ ways to do this. Hence,
by the Sum Rule,

\[ C(n; n_1, \ldots, n_s) = \sum_{i=1}^{s} C(n-1; n_1, \ldots, n_i - 1, \ldots, n_s). \]


If we adopt the convention that $C(n; n_1, \ldots, n_s) = 0$ whenever any $n_i$ is negative, then
this recursion holds (with the same proof) for all choices of $n_i \ge 0$ and n > 0. The initial
condition is C(0; 0, 0, ..., 0) = 1, since the empty word is the unique rearrangement of zero
copies of the given letters.

Now let us prove that $C(n; n_1, \ldots, n_s) = n! / \prod_{k=1}^{s} n_k!$ by induction on n. In the base
case, $n = n_1 = \cdots = n_s = 0$, and the required formula follows from the initial condition.
For the induction step, assume that n > 0 and that

\[ C(n-1; m_1, \ldots, m_s) = \frac{(n-1)!}{\prod_{k=1}^{s} m_k!} \]

whenever $m_1 + \cdots + m_s = n - 1$. Assume that we are given integers $n_1, \ldots, n_s \ge 0$ that
sum to n. Using the recursion and induction hypothesis, we compute as follows:

\begin{align*}
C(n; n_1, \ldots, n_s) &= \sum_{k=1}^{s} C(n-1; n_1, \ldots, n_k - 1, \ldots, n_s) \\
  &= \sum_{\substack{1 \le k \le s \\ n_k > 0}} \frac{(n-1)!}{(n_k - 1)! \prod_{j \ne k} n_j!} \\
  &= \sum_{\substack{1 \le k \le s \\ n_k > 0}} \frac{(n-1)!\, n_k}{\prod_{j=1}^{s} n_j!}
   = \sum_{k=1}^{s} \frac{(n-1)!\, n_k}{\prod_{j=1}^{s} n_j!} \\
  &= \frac{(n-1)!}{\prod_{j=1}^{s} n_j!} \sum_{k=1}^{s} n_k = \frac{n!}{\prod_{j=1}^{s} n_j!}.
\end{align*}

2.9 Recursions for Lattice Paths 


Recursive techniques allow us to count many collections of lattice paths. We first consider 
the situation of lattice paths in a rectangle. Compare the next result to Identity 2.14. 


2.24. Recursion for Paths in a Rectangle. For $a, b \ge 0$, let L(a,b) be the number of
lattice paths from the origin to (a,b). We have L(a,0) = L(0,b) = 1 for all $a, b \ge 0$. If a > 0
and b > 0, note that any lattice path ending at (a,b) arrives there via an east step or a
north step. We obtain lattice paths of the first kind by taking any lattice path ending at
(a − 1, b) and appending an east step. We obtain lattice paths of the second kind by taking
any lattice path ending at (a, b − 1) and appending a north step. Hence, by the Sum Rule,

\[ L(a,b) = L(a-1,b) + L(a,b-1) \quad \text{for all } a, b > 0. \]

It can now be shown (by induction on a + b) that

\[ L(a,b) = \binom{a+b}{a,b} = \frac{(a+b)!}{a!\,b!}. \]

We can visually display and calculate the numbers L(a,b) by labeling each lattice point 
(a,b) with the number L(a,b). The initial conditions say that the lattice points on the axes 
are labeled 1. The recursion says that the label of some point (a, b) is the sum of the labels 
of the point (a — 1,b) to its immediate left and the point (a,b — 1) immediately below it. 
See Figure 2.6, in which we recognize yet another shifted version of Pascal’s Triangle. 


By modifying the boundary conditions, we can adapt the recursion in the previous 
example to count more complicated collections of lattice paths. 


2.25. Recursion for Paths in a Triangle. For $b \ge a \ge 0$, let T(a,b) be the number
of lattice paths from the origin to (a,b) that always stay weakly above the line y = x. (In
particular, T(n,n) is the number of Dyck paths of order n.) By the same argument used
above, we have

\[ T(a,b) = T(a-1,b) + T(a,b-1) \quad \text{for } b > a > 0. \]

On the other hand, when a = b > 0, a lattice path can only reach (a,b) = (a,a) by taking
an east step, since the point (a, b − 1) lies below y = x. Thus,

\[ T(a,a) = T(a-1,a) \quad \text{for } a > 0. \]
FIGURE 2.6
Recursive enumeration of lattice paths. 


FIGURE 2.7 
Recursive enumeration of lattice paths in a triangle. 


The initial conditions are T(0,b) = 1 for all $b \ge 0$. Figure 2.7 shows how to compute
the numbers T(a,b) by drawing a picture. We see the Catalan numbers 1,1,2,5,14,... 
appearing on the main diagonal. 
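The triangle recursion with its modified boundary condition can be sketched in code as follows. The dictionary keyed by points (a, b) and the name `triangle_paths` are illustrative choices, not notation from the text.

```python
def triangle_paths(n_max):
    """Return T with T[(a, b)] for 0 <= a <= b <= n_max: the number of lattice
    paths from (0,0) to (a,b) that stay weakly above the line y = x."""
    T = {}
    for b in range(n_max + 1):
        for a in range(b + 1):
            if a == 0:
                T[(a, b)] = 1                  # initial condition T(0,b) = 1
            elif a == b:
                T[(a, b)] = T[(a - 1, b)]      # diagonal: last step must be east
            else:
                T[(a, b)] = T[(a - 1, b)] + T[(a, b - 1)]
    return T
```

Reading off `T[(n, n)]` for successive n gives the diagonal entries, which are the Catalan numbers mentioned above.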


It turns out that there is an explicit closed formula for the numbers T(a, b), which are 
called ballot numbers. 


2.26. Theorem: Ballot Numbers. For $b \ge a \ge 0$, the number of lattice paths from the
origin to (a,b) that always stay weakly above the line y = x is

\[ \frac{b-a+1}{b+a+1} \binom{a+b+1}{a}. \]

In particular, the number of Dyck paths of order n is $C_n = \frac{1}{2n+1} \binom{2n+1}{n}$.
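Proof. We show that

\[ T(a,b) = \frac{b-a+1}{b+a+1} \binom{a+b+1}{a} \]

by induction on a + b. If a + b = 0, so that a = b = 0, then $T(0,0) = 1 = \frac{0-0+1}{0+0+1} \binom{0+0+1}{0}$. Now
assume that a + b > 0 and that $T(c,d) = \frac{d-c+1}{d+c+1} \binom{c+d+1}{c}$ whenever $d \ge c \ge 0$ and c + d < a + b.
To prove the claimed formula for T(a,b), we consider cases based on the recursions and
initial conditions. First, if a = 0 and b > 0, we have $T(a,b) = 1 = \frac{b-0+1}{b+0+1} \binom{b+0+1}{0}$. Second, if
a = b > 0, we have

\begin{align*}
T(a,b) = T(a,a) = T(a-1,a) &= \frac{2}{2a} \binom{2a}{a-1} = \frac{(2a)!}{a!\,(a+1)!} = \frac{1}{2a+1} \binom{2a+1}{a} \\
  &= \frac{b-a+1}{b+a+1} \binom{a+b+1}{a}.
\end{align*}

Third, if b > a > 0, we have

\begin{align*}
T(a,b) &= T(a-1,b) + T(a,b-1) = \frac{b-a+2}{a+b} \binom{a+b}{a-1} + \frac{b-a}{a+b} \binom{a+b}{a} \\
  &= \frac{(b-a+2)(a+b-1)!}{(a-1)!\,(b+1)!} + \frac{(b-a)(a+b-1)!}{a!\,b!} \\
  &= \left[ \frac{a(b-a+2)}{a+b} + \frac{(b-a)(b+1)}{a+b} \right] \frac{(a+b)!}{a!\,(b+1)!} \\
  &= \left[ \frac{ab - a^2 + 2a + b^2 - ab + b - a}{a+b} \right] \frac{(a+b+1)!}{(b+a+1)\,a!\,(b+1)!} \\
  &= \left[ \frac{(a+b)(b-a+1)}{a+b} \right] \frac{1}{b+a+1} \binom{a+b+1}{a} \\
  &= \frac{b-a+1}{b+a+1} \binom{a+b+1}{a}. \qquad \Box
\end{align*}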


The numbers T(a,b) in the previous theorem are called ballot numbers for the following
reason. Let $w \in \{N, E\}^{a+b}$ be a lattice path counted by T(a,b). Imagine that a + b people
are voting for two candidates (candidate N and candidate E) by casting an ordered sequence
of a + b ballots. The path w records this sequence of ballots as follows: $w_j = N$ if the jth
person votes for candidate N, and $w_j = E$ if the jth person votes for candidate E. The
condition that w stays weakly above y = x means that candidate N always has at least as
many votes as candidate E at each stage in the election process. The condition that w ends
at (a,b) means that candidate N has b votes and candidate E has a votes at the end of the
election.

Returning to lattice paths, suppose we replace the boundary line y = x by the line
y = mx (where m is any positive integer). We can then derive the following more general
result.


2.27. Theorem: m-Ballot Numbers. Let m be a fixed positive integer. For $b \ge ma \ge 0$,
the number of lattice paths from the origin to (a,b) that always stay weakly above the line
y = mx is

\[ \frac{b-ma+1}{b+a+1} \binom{a+b+1}{a}. \]

In particular, the number of such paths ending at (n, mn) is

\[ \frac{1}{(m+1)n+1} \binom{(m+1)n+1}{n}. \]


Proof. Let $T_m(a,b)$ be the number of paths ending at (a,b) that never go below y = mx.
Arguing as before, we have $T_m(0,b) = 1$ for all $b \ge 0$; $T_m(a,b) = T_m(a-1,b) + T_m(a,b-1)$
whenever b > ma > 0; and $T_m(a,ma) = T_m(a-1,ma)$ since the point (a, ma − 1) lies below
the line y = mx. It can now be proved that

\[ T_m(a,b) = \frac{b-ma+1}{b+a+1} \binom{a+b+1}{a} \]

by induction on a + b. The proof is similar to the one given above, so we omit it. For a
bijective proof of this theorem, see Exercise 12-3. □


When the slope m of the boundary line y = mx is not an integer, we cannot use
the formula in the preceding theorem. Nevertheless, the recursion (with appropriate initial
conditions) can still be used to count lattice paths bounded below by this line. For example,
Figure 2.8 illustrates the enumeration of lattice paths from (0,0) to (6,9) that always stay
weakly above y = (3/2)x.

[Figure: lattice points labeled with path counts. The surviving rows of labels read, for y = 9: 1, 8, 35, 105, 241, 377, 377; and for y = 8: 1, 7, 27, 70, 136, 136.]

FIGURE 2.8
Recursive enumeration of lattice paths above y = (3/2)x.
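The recursion with boundary conditions adapts to any positive slope, integer or not. The sketch below is one possible implementation under our own conventions: points strictly below the line get count 0, the slope is stored as an exact `Fraction` to avoid rounding, and the name `paths_above_line` is hypothetical.

```python
from fractions import Fraction

def paths_above_line(a_max, b_max, slope):
    """Count lattice paths from (0,0) to (a_max, b_max) that stay weakly
    above the line y = slope * x, where slope is a positive rational."""
    slope = Fraction(slope)
    count = {}
    for a in range(a_max + 1):
        for b in range(b_max + 1):
            if b < slope * a:
                count[(a, b)] = 0      # point lies strictly below the line
            elif a == 0:
                count[(a, b)] = 1      # only north steps reach (0, b)
            else:
                below = count[(a, b - 1)] if b > 0 else 0
                count[(a, b)] = count[(a - 1, b)] + below
    return count[(a_max, b_max)]
```

With `slope = Fraction(3, 2)` and endpoint (6, 9) this reproduces the count 377 labeling the final point in Figure 2.8; with an integer slope m it can be compared against the m-ballot formula.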
2.10 Catalan Recursions


The recursions from the previous section provide one way of computing Catalan numbers,
which are special cases of ballot numbers. This section discusses another recursion that
involves only the Catalan numbers. This convolution recursion appears in many settings,
thus leading to many different combinatorial interpretations for the Catalan numbers.


2.28. Theorem: Catalan Recursion. The Catalan numbers $C_n = \frac{1}{n+1} \binom{2n}{n}$ satisfy the
recursion

\[ C_n = \sum_{k=1}^{n} C_{k-1} C_{n-k} \quad \text{for all } n > 0, \]

with initial condition $C_0 = 1$.


Proof. Recall from the Dyck Path Rule 1.101 that $C_n$ is the number of Dyck paths of order
n. There is one Dyck path of order 0, so $C_0 = 1$. Fix n > 0, and let A be the set of Dyck
paths ending at (n,n). For $1 \le k \le n$, let $A_k$ be the set of Dyck paths of order n that return
to the diagonal line y = x for the first time at the point (k,k). See Figure 2.9. Suppose w
is the word in $\{N, E\}^{2n}$ that encodes a path in $A_k$. Inspection of Figure 2.9 shows that we
have the factorization $w = N w_1 E w_2$, where $w_1$ encodes a Dyck path of order k − 1 (starting
at (0,1) in the figure) and $w_2$ encodes a Dyck path of order n − k (starting at (k,k) in the
figure). We can uniquely construct all paths in $A_k$ by choosing $w_1$ and $w_2$ and then setting
$w = N w_1 E w_2$. There are $C_{k-1}$ choices for $w_1$ and $C_{n-k}$ choices for $w_2$. By the Product Rule
and Sum Rule,

\[ C_n = |A| = \sum_{k=1}^{n} |A_k| = \sum_{k=1}^{n} C_{k-1} C_{n-k}. \qquad \Box \]

[Figure: a Dyck path from (0,0) to (n,n) that first returns to the diagonal at (k,k).]

FIGURE 2.9
Proving the Catalan recursion by analyzing the first return to y = x.


We now show that the Catalan recursion uniquely determines the Catalan numbers. 
2.29. Proposition. Suppose $d_0, d_1, \ldots, d_n, \ldots$ is a sequence such that $d_0 = 1$ and
$d_n = \sum_{k=1}^{n} d_{k-1} d_{n-k}$ for all n > 0. Then $d_n = C_n = \frac{1}{n+1} \binom{2n}{n}$ for all $n \ge 0$.

Proof. We argue by strong induction. For n = 0, we have $d_0 = 1 = C_0$. Assume that n > 0
and that $d_m = C_m$ for all m < n. Then $d_n = \sum_{k=1}^{n} d_{k-1} d_{n-k} = \sum_{k=1}^{n} C_{k-1} C_{n-k} = C_n$. □
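Since the convolution recursion together with the initial value 1 determines the whole sequence, it gives a direct way to compute Catalan numbers. The following sketch (function name `catalan_by_recursion` is ours) builds the sequence from the recursion alone; comparing with the closed formula for small n illustrates Proposition 2.29 without replacing its proof.

```python
def catalan_by_recursion(n_max):
    """Return [C_0, ..., C_{n_max}] using C_n = sum_{k=1}^{n} C_{k-1} * C_{n-k}, C_0 = 1."""
    C = [1]
    for n in range(1, n_max + 1):
        C.append(sum(C[k - 1] * C[n - k] for k in range(1, n + 1)))
    return C
```

Each new term uses all earlier terms, so computing the first n values this way takes on the order of n^2 multiplications.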


We can now prove that various collections of objects are counted by the Catalan numbers. 
One proof method sets up a bijection between such objects and other objects (like Dyck 
paths) that are already known to be counted by Catalan numbers. A second proof method 
shows that the new collections of objects satisfy the Catalan recursion. We illustrate both 
methods in the examples below. 


2.30. Example: Balanced Parentheses. For $n \ge 0$, let $BP_n$ be the set of all words
consisting of n left parentheses and n right parentheses, such that every left parenthesis can
be matched with a right parenthesis later in the word. For example, $BP_3$ consists of the
following five words:

((()))    (()())    (())()    ()(())    ()()()

We show that $|BP_n| = C_n$ for all n by exhibiting a bijection between $BP_n$ and the set of
Dyck paths of order n. Given $w \in BP_n$, replace each left parenthesis by N (which encodes a
north step) and each right parenthesis by E (which encodes an east step). It can be checked
that a string w of n left and n right parentheses is balanced iff for every $i \le 2n$, the number
of left parentheses in the prefix $w_1 w_2 \cdots w_i$ weakly exceeds the number of right parentheses
in this prefix. Converting to north and east steps, this condition means that no lattice point
on the path lies strictly below the line y = x. Thus we have mapped each $w \in BP_n$ to a
Dyck path. This map is a bijection, so $|BP_n| = C_n$.


2.31. Example: Binary Trees. We recursively define the set of binary trees with n nodes
as follows. The empty set is the unique binary tree with 0 nodes. If $T_1$ is a binary tree with
h nodes and $T_2$ is a binary tree with k nodes, then the ordered triple $T = (\bullet, T_1, T_2)$ is a
binary tree with h + k + 1 nodes. By definition, all binary trees arise by a finite number of
applications of these rules. If $T = (\bullet, T_1, T_2)$ is a binary tree, we call $T_1$ the left subtree of
T and $T_2$ the right subtree of T. Note that $T_1$ or $T_2$ (or both) may be empty. We can draw
a picture of a nonempty binary tree T as follows. First, draw a root node of the binary tree
at the top of the picture. If $T_1$ is nonempty, draw an edge leading down and left from the
root node, and then draw the picture of $T_1$. If $T_2$ is nonempty, draw an edge leading down
and right from the root node, and then draw the picture of $T_2$. For example, Figure 2.10
displays the five binary trees with three nodes. Figure 2.11 depicts a larger binary tree that
is formally represented by the sequence

\[ T = (\bullet, (\bullet, (\bullet, (\bullet,\emptyset,\emptyset), (\bullet,\emptyset,\emptyset)), (\bullet,\emptyset,\emptyset)), (\bullet, (\bullet, \emptyset, (\bullet, (\bullet,\emptyset,\emptyset), \emptyset)), \emptyset)). \]

Let $BT_n$ denote the set of binary trees with n nodes. We show that $|BT_n| = C_n$ for all n by
verifying that the sequence $(|BT_n| : n \ge 0)$ satisfies the Catalan recursion. First, $|BT_0| = 1$
by definition. Second, suppose $n \ge 1$. By the recursive definition of binary trees, we can
uniquely construct a typical element of $BT_n$ as follows. Fix k with $1 \le k \le n$. Choose a
tree $T_1 \in BT_{k-1}$ with k − 1 nodes. Then choose a tree $T_2 \in BT_{n-k}$ with n − k nodes. We
assemble these trees (together with a new root node) to get a binary tree $T = (\bullet, T_1, T_2)$
with (k − 1) + 1 + (n − k) = n nodes. By the Sum Rule and Product Rule, we have

\[ |BT_n| = \sum_{k=1}^{n} |BT_{k-1}|\,|BT_{n-k}|. \]

It follows from Proposition 2.29 that $|BT_n| = C_n$ for all $n \ge 0$.
FIGURE 2.10

The five binary trees with three nodes. 


FIGURE 2.11 
A binary tree with ten nodes. 


2.32. Example: 231-avoiding permutations. Suppose $w = w_1 w_2 \cdots w_n$ is a permutation
of n distinct integers. We say that w is 231-avoiding iff there do not exist indices
i < k < p such that $w_p < w_i < w_k$. This means that no three-element subsequence
$w_i \ldots w_k \ldots w_p$ in w has the property that $w_p$ is the smallest number in $\{w_i, w_k, w_p\}$ and
$w_k$ is the largest number in $\{w_i, w_k, w_p\}$. For example, when n = 4, there are fourteen
231-avoiding permutations of {1, 2, 3, 4}:

1234, 1243, 1324, 1423, 1432, 2134, 2143,
3124, 3214, 4123, 4132, 4213, 4312, 4321.

The following ten permutations do contain occurrences of the pattern 231:
2314, 2341, 2431, 4231, 3421, 3412, 3142, 1342, 3241, 2413.

Let $S_n^{231}$ be the set of 231-avoiding permutations of {1,2,...,n}. We prove that $|S_n^{231}| = C_n$
for all $n \ge 0$ by verifying the Catalan recursion. First, $|S_0^{231}| = 1 = C_0$ since the empty
permutation is certainly 231-avoiding. Next, suppose n > 0. We construct a typical object
$w \in S_n^{231}$ as follows. Consider cases based on the position of the letter n in w. Say $w_k = n$.
For all i < k and all p > k, we must have $w_i < w_p$; otherwise, the subsequence $w_i, w_k = n, w_p$
would be an occurrence of the forbidden 231 pattern. Assuming that $w_i < w_p$ whenever
i < k < p, it can be checked that $w = w_1 w_2 \cdots w_n$ is 231-avoiding iff $w_1 w_2 \cdots w_{k-1}$ is
231-avoiding and $w_{k+1} \cdots w_n$ is 231-avoiding. Thus, for a fixed k, we can construct w by
choosing an arbitrary 231-avoiding permutation w′ of the k − 1 letters {1,2,...,k − 1} in
$|S_{k-1}^{231}|$ ways, then choosing an arbitrary 231-avoiding permutation w″ of the n − k letters
{k,...,n − 1} in $|S_{n-k}^{231}|$ ways, and finally letting w be the concatenation of w′, the letter n,
and w″. By the Sum Rule and Product Rule, we have

\[ |S_n^{231}| = \sum_{k=1}^{n} |S_{k-1}^{231}|\,|S_{n-k}^{231}|. \]

By Proposition 2.29, $|S_n^{231}| = C_n$ for all $n \ge 0$.
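The definition of 231-avoidance can be tested by brute force for small n, which gives an independent check on the counts in this example. In this sketch permutations are tuples of integers and indices are 0-based; the function names are illustrative only.

```python
from itertools import combinations, permutations

def avoids_231(w):
    """True iff there are no indices i < k < p with w[p] < w[i] < w[k]."""
    return not any(
        w[p] < w[i] < w[k] for i, k, p in combinations(range(len(w)), 3)
    )

def count_231_avoiding(n):
    """Count 231-avoiding permutations of {1,...,n} by checking all n! permutations."""
    return sum(1 for w in permutations(range(1, n + 1)) if avoids_231(w))
```

Because it examines all n! permutations, `count_231_avoiding` is only a check for small n; the Catalan recursion is the efficient way to obtain the counts.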
2.33. Example: τ-avoiding permutations. Let $\tau : \{1,2,\ldots,k\} \to \{1,2,\ldots,k\}$ be a
fixed permutation of k letters. A permutation w of {1,2,...,n} is called τ-avoiding iff there
do not exist indices $1 \le i(1) < i(2) < \cdots < i(k) \le n$ such that

\[ w_{i(\tau^{-1}(1))} < w_{i(\tau^{-1}(2))} < \cdots < w_{i(\tau^{-1}(k))}. \]

This means that no subsequence of k entries of w consists of numbers in the same relative
order as the numbers $\tau_1, \tau_2, \ldots, \tau_k$. For instance, w = 15362784 is not 2341-avoiding, since
the subsequence 5684 matches the pattern 2341 (as does the subsequence 5674). On the
other hand, w is 4321-avoiding, since there is no decreasing subsequence of w of length 4.
Let $S_n^{\tau}$ denote the set of τ-avoiding permutations of {1,2,...,n}.

For general τ, the enumeration of τ-avoiding permutations is an extremely difficult
problem that has received much attention in recent years. On the other hand, if τ is a
permutation of k = 3 letters, then the number of τ-avoiding permutations of length n is
always the Catalan number $C_n$, for all six possible choices of τ. We have already proved
this in the last example for τ = 231. The arguments in that example readily adapt to prove
the Catalan recursion for τ = 132, τ = 213, and τ = 312. However, more subtle arguments
are needed to prove this result for τ = 123 and τ = 321 (see Theorem 12.84).


2.34. Remark: Catalan Bijections. Let $(A_n : n \ge 0)$ and $(B_n : n \ge 0)$ be two families of
combinatorial objects such that $|A_n| = C_n = |B_n|$ for all n. Suppose that we have an explicit
bijective proof that the numbers $|A_n|$ satisfy the Catalan recursion. This means that we can
describe a bijection $g_n$ from the set $A_n$ onto the union of the disjoint sets $\{k\} \times A_{k-1} \times A_{n-k}$
for k = 1,2,...,n. (Such a bijection can often be constructed from a counting argument
involving the Sum Rule and Product Rule.) Suppose we have similar bijections $h_n$ for the sets
$B_n$. We can combine these bijections to obtain recursively defined bijections $f_n : A_n \to B_n$.
First, there is a unique bijection $f_0 : A_0 \to B_0$, since $|A_0| = 1 = |B_0|$. Second, fix n > 0,
and assume that bijections $f_m : A_m \to B_m$ have already been defined for all m < n. Define
$f_n : A_n \to B_n$ as follows. Given $x \in A_n$, suppose $g_n(x) = (k, y, z)$ where $1 \le k \le n$,
$y \in A_{k-1}$, and $z \in A_{n-k}$. Set

\[ f_n(x) = h_n^{-1}(k, f_{k-1}(y), f_{n-k}(z)). \]

The inverse map is defined analogously.

For example, let us recursively define a bijection $\phi$ from the set of binary trees to the set
of Dyck paths such that trees with $n$ nodes map to paths of order $n$. Linking together the
first-return recursion for Dyck paths with the left/right-subtree recursion for binary trees
as discussed in the previous paragraph, we obtain the rule

$$\phi(\emptyset) = \epsilon \text{ (the empty word)}; \qquad \phi((\bullet, T_1, T_2)) = N\,\phi(T_1)\,E\,\phi(T_2).$$

For example, the one-node tree $(\bullet, \emptyset, \emptyset)$ maps to the Dyck path
$N\epsilon E\epsilon = NE$. It then follows that

$$\phi((\bullet, (\bullet,\emptyset,\emptyset), \emptyset)) = N(NE)E\epsilon = NNEE;$$
$$\phi((\bullet, \emptyset, (\bullet,\emptyset,\emptyset))) = N\epsilon E(NE) = NENE;$$
$$\phi((\bullet, (\bullet,\emptyset,\emptyset), (\bullet,\emptyset,\emptyset))) = N(NE)E(NE) = NNEENE;$$

and so on. Figure 2.12 illustrates the recursive computation of $\phi(T)$ for the binary tree $T$
shown in Figure 2.11.
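The recursive rule for the tree-to-Dyck-path bijection can be sketched in a few lines of Python. This is only an illustration under an assumed encoding that is not part of the text: the empty tree is `None`, and a nonempty tree is a pair `(left, right)` of subtrees. The function name `phi` is likewise chosen here for illustration.

```python
def phi(tree):
    """Map a binary tree to a Dyck word using phi((root, T1, T2)) = N phi(T1) E phi(T2).

    Assumed encoding: None is the empty tree; (left, right) is a root with two subtrees.
    """
    if tree is None:              # the empty tree maps to the empty word
        return ""
    left, right = tree
    return "N" + phi(left) + "E" + phi(right)


leaf = (None, None)               # the one-node tree
print(phi(leaf))                  # NE
print(phi((leaf, None)))          # NNEE
print(phi((None, leaf)))          # NENE
print(phi((leaf, leaf)))          # NNEENE
```

The four printed words agree with the values computed by hand in the text.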

FIGURE 2.12
Mapping binary trees to Dyck paths.


As another example, let us recursively define a bijection $\psi$ from the set of binary trees
to the set of 231-avoiding permutations such that trees with $n$ nodes map to permutations
of $n$ letters. Linking together the two proofs of the Catalan recursion for binary trees and
231-avoiding permutations, we obtain the rule

$$\psi(\emptyset) = \epsilon, \qquad \psi((\bullet, T_1, T_2)) = \psi(T_1)\, n\, \psi'(T_2),$$

where $\psi'(T_2)$ is the permutation obtained by increasing each entry of $\psi(T_2)$ by
$k - 1 = |T_1|$. Figure 2.13 illustrates the recursive computation of $\psi(T)$ for the binary
tree $T$ shown in Figure 2.11.
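As a hedged illustration of the rule for the map from binary trees to 231-avoiding permutations, the sketch below uses an assumed encoding that is not from the text: `None` is the empty tree and `(left, right)` is a nonempty tree. The helper names `size` and `psi` are hypothetical. The largest letter n is placed between the image of the left subtree and the shifted image of the right subtree.

```python
def size(tree):
    """Number of nodes in a tree encoded as None or (left, right)."""
    return 0 if tree is None else 1 + size(tree[0]) + size(tree[1])


def psi(tree):
    """Map a binary tree with n nodes to a permutation of 1..n (as a list)."""
    if tree is None:
        return []
    left, right = tree
    n = size(tree)
    shift = size(left)            # entries of psi(right) are raised by |T1|
    return psi(left) + [n] + [v + shift for v in psi(right)]


leaf = (None, None)
print(psi((None, (leaf, None))))  # [3, 1, 2]
```

Applying `psi` to the five trees with three nodes produces every permutation of {1,2,3} except 231, as expected for a bijection onto the 231-avoiding permutations.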




2.11 Integer Partitions 


This section introduces integer partitions, which are ways of writing a given positive integer 
as a sum of positive integers. Integer partitions are similar to compositions, but here the 
order of the summands does not matter. For example, 1+3+3 and 3+1+3 and 3+3+1
are three different compositions of 7 that represent the same integer partition of 7. By 
convention, the summands in an integer partition are listed in weakly decreasing order, as 
in the following formal definition. 


FIGURE 2.13
Mapping binary trees to 231-avoiding permutations.


2.35. Definition: Integer Partitions. Let $n$ be a nonnegative integer. An integer partition
of $n$ is a sequence $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$ of positive integers such that
$\mu_1 + \mu_2 + \cdots + \mu_k = n$ and $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_k$. Each $\mu_i$ is called a
part of the partition. Let $p(n)$ be the number of integer partitions of $n$, and let $p(n,k)$ be
the number of integer partitions of $n$ into exactly $k$ parts. If $\mu$ is a partition of $n$ into
$k$ parts, we write $|\mu| = n$ and $\ell(\mu) = k$ and say that $\mu$ has area $n$ and length $k$.


2.36. Example. The integer partitions of 5 are 
(5), (4,1), (3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1).


Thus, p(5) = 7, p(5,1) = 1, p(5, 2) = 2, p(5,3) = 2, p(5,4) = 1, and p(5,5) = 1. As another 
example, the empty sequence is the unique integer partition of 0, so p(0) = 1 = p(0,0). 

When presenting integer partitions, we often use the notation $j^{a_j}$ to abbreviate a
sequence of $a_j$ copies of $j$. For example, the last three partitions of 5 in the preceding
example could be written $(2^2,1)$, $(2,1^3)$, and $(1^5)$. We now describe a way of visualizing
integer partitions pictorially.


2.37. Definition: Diagram of a Partition. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$ be an integer
partition of $n$. The diagram of $\mu$ is the set

$$\mathrm{dg}(\mu) = \{(i,j) \in \mathbb{Z}_{>0} \times \mathbb{Z}_{>0} : 1 \le i \le k,\ 1 \le j \le \mu_i\}.$$

We can make a picture of $\mathrm{dg}(\mu)$ by drawing an array of $n$ boxes, with $\mu_i$ left-justified
boxes in row $i$. For example, Figure 2.14 illustrates the diagrams for the seven integer
partitions of 5. Note that $|\mu| = \mu_1 + \cdots + \mu_k = |\mathrm{dg}(\mu)|$ is the total number of unit
boxes in the diagram of $\mu$, which is why we call $|\mu|$ the area of $\mu$. The length $\ell(\mu)$ is the
number of rows in the diagram of $\mu$.

We know from the Composition Rule that there are $2^{n-1}$ compositions of $n$. There is no
simple analogous formula for the partition number p(n). Fortunately, the numbers p(n, k) 
do satisfy the following recursion. 

FIGURE 2.14
Partition diagrams.


2.38. Recursion for Integer Partitions. For all positive integers $n$ and $k$,

$$p(n,k) = p(n-1,k-1) + p(n-k,k).$$

The initial conditions are $p(n,k) = 0$ for $k > n$ or $k < 0$, $p(n,0) = 0$ for $n > 0$, and
$p(0,0) = 1$.


Proof. Fix $n,k \in \mathbb{Z}_{>0}$. Let $A$ be the set of integer partitions of $n$ into $k$ parts, so
$|A| = p(n,k)$ by definition. Write $A = B \cup C$, where $B = \{\mu \in A : \mu_k = 1\}$ and
$C = \{\mu \in A : \mu_k > 1\}$. In terms of partition diagrams, $B$ consists of those partitions in
$A$ with a single box in the lowest row, and $C$ consists of those partitions in $A$ with more than
one box in the lowest row. We can build an object in $B$ by choosing any partition $\nu$ of $n-1$
into $k-1$ parts (in any of $p(n-1,k-1)$ ways), then adding one new part of size 1 at the end.
Thus, $|B| = p(n-1,k-1)$. Figure 2.15 illustrates the case $n = 13$, $k = 5$. We can build an object
in $C$ by choosing any partition $\nu$ of $n-k$ into $k$ parts (in any of $p(n-k,k)$ ways), then
adding 1 to each of the $k$ parts of $\nu$. This corresponds to adding a new column of $k$ cells to
the diagram of $\nu$; see Figure 2.15. We conclude that $|C| = p(n-k,k)$. The recursion now
follows from the Sum Rule. The initial conditions are readily verified. □
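The two cases in this proof (lowest part equal to 1, or every part larger than 1) give the recursion p(n,k) = p(n-1,k-1) + p(n-k,k). The following Python sketch computes p(n,k) that way, with the initial conditions stated above. The function names `p` and `p_total` are chosen here for illustration only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p(n, k):
    """Number of integer partitions of n into exactly k parts."""
    if n == 0 and k == 0:
        return 1                  # the empty partition
    if k <= 0 or k > n:
        return 0                  # initial conditions
    # lowest part is 1, or every part exceeds 1 (remove one column of k cells)
    return p(n - 1, k - 1) + p(n - k, k)


def p_total(n):
    """p(n), obtained by summing p(n, k) over all possible lengths k."""
    return sum(p(n, k) for k in range(n + 1))


print([p(5, k) for k in range(1, 6)])   # [1, 2, 2, 1, 1]
print(p_total(5))                       # 7
```

The printed values match the counts listed in Example 2.36.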


It is visually evident that if we interchange the rows and columns in a partition diagram, 
we obtain the diagram of another integer partition. This leads to the following definition. 


2.39. Definition: Conjugate Partitions. Suppose $\mu$ is an integer partition of $n$. The
conjugate partition of $\mu$ is the unique integer partition $\mu'$ of $n$ satisfying

$$\mathrm{dg}(\mu') = \{(j,i) : (i,j) \in \mathrm{dg}(\mu)\}.$$

Figure 2.16 shows that the conjugate of $\mu = (7,4,3,1,1)$ is $\mu' = (5,3,3,2,1,1,1)$. Taking
the conjugate twice restores the original partition: $\mu'' = \mu$ for all integer partitions $\mu$.
Conjugation leads to the following new interpretation for the numbers $p(n,k)$.


2.40. Proposition. The number of integer partitions of n into k parts (namely, p(n, k)) 
equals the number of integer partitions of n with first part k. 

FIGURE 2.15
Proof of the recursion for p(n, k). Top panel: a partition with n-1 = 12, k-1 = 4 becomes
a partition with n = 13, k = 5 by adding a new cell. Bottom panel: a partition with
n-k = 8, k = 5 becomes a partition with n = 13, k = 5 by adding a left column.


FIGURE 2.16
Conjugate of a partition: (7,4,3,1,1) and its conjugate (5,3,3,2,1,1,1).

Proof. Note that the length of a partition $\mu$ is the number of rows in $\mathrm{dg}(\mu)$, whereas the
first part of $\mu$ is the number of columns of $\mathrm{dg}(\mu)$. Conjugation interchanges the rows and
columns of a diagram. So, the map sending $\mu$ to $\mu'$ is a bijection from the set of integer
partitions of $n$ with $k$ parts onto the set of integer partitions of $n$ with first part $k$. □
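Conjugation is easy to compute directly from the definition: column j of the diagram of a partition contains one box for each part of size at least j. The sketch below assumes a partition is given as a weakly decreasing Python list; the function name `conjugate` is chosen for illustration.

```python
def conjugate(mu):
    """Return the conjugate of a partition given as a weakly decreasing list."""
    if not mu:
        return []
    # the j-th part of the conjugate counts the parts of mu that are at least j
    return [sum(1 for part in mu if part >= j) for j in range(1, mu[0] + 1)]


mu = [7, 4, 3, 1, 1]
print(conjugate(mu))                  # [5, 3, 3, 2, 1, 1, 1]
print(conjugate(conjugate(mu)))       # [7, 4, 3, 1, 1]
print(len(mu) == conjugate(mu)[0])    # True: length of mu equals first part of its conjugate
```

The last line illustrates the exchange of "number of parts" and "first part" used in the proof above.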


Our next result counts integer partitions whose diagrams fit in a box with b rows and a 
columns. 


2.41. Theorem: Partitions in a Box. The number of integer partitions $\mu$ such that
$\mathrm{dg}(\mu) \subseteq \{1,2,\ldots,b\} \times \{1,2,\ldots,a\}$ is

$$\binom{a+b}{a,b} = \frac{(a+b)!}{a!\,b!}.$$

Proof. We define a bijection between the set of integer partitions in the theorem statement
and the set of all lattice paths from the origin to $(a,b)$. We draw our partition diagrams
in the box with corners $(0,0)$, $(a,0)$, $(0,b)$, and $(a,b)$, as shown in Figure 2.17. Given a
partition $\mu$ whose diagram fits in this box, the southeast boundary of $\mathrm{dg}(\mu)$ is a lattice path
from the origin to $(a,b)$. We call this lattice path the frontier of $\mu$ (which depends on $a$ and
$b$ as well as $\mu$). For example, if $a = 16$, $b = 10$, and $\mu = (10,10,5,4,4,4,2)$, we see from
Figure 2.17 that the frontier of $\mu$ is

NNNEENEENNNENEEEEENNEEEEEE.

Conversely, given any lattice path ending at $(a,b)$, the set of lattice squares northwest of
this path in the box uniquely determines the diagram of an integer partition. We already
know that the number of lattice paths from the origin to $(a,b)$ is $\binom{a+b}{a,b}$, so the theorem
follows from the Bijection Rule. □


FIGURE 2.17
Counting partitions that fit in an a x b box; the partition shown is (10, 10, 5, 4, 4, 4, 2).
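The frontier construction can be checked mechanically. The sketch below reads the rows of the box from bottom to top: each row contributes east steps up to its length, followed by one north step. The function name `frontier` and the convention of padding the partition with zero parts up to b rows are assumptions made here for illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def frontier(mu, a, b):
    """Frontier of partition mu inside the box with a columns and b rows."""
    rows = list(mu) + [0] * (b - len(mu))   # pad with empty rows
    word, x = [], 0
    for part in reversed(rows):             # bottom row first
        word.append("E" * (part - x))       # walk under the extra boxes
        word.append("N")                    # then climb past this row
        x = part
    word.append("E" * (a - x))              # finish along the top edge
    return "".join(word)


print(frontier((10, 10, 5, 4, 4, 4, 2), 16, 10))

# Brute-force check of the theorem for a small box (a = 3 columns, b = 2 rows).
a, b = 3, 2
words = {frontier(tuple(reversed(c)), a, b)
         for c in combinations_with_replacement(range(a + 1), b)}
print(len(words), comb(a + b, a))
```

The first call reproduces the frontier word displayed in the proof, and the brute-force check finds as many distinct frontiers as the binomial coefficient predicts.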


2.42. Remark: Euler’s Partition Recursion. Our recursion for $p(n,k)$ gives a quick
method for computing the quantities $p(n,k)$ and $p(n) = \sum_{k=1}^{n} p(n,k)$. The reader may
wonder if the numbers $p(n)$ satisfy any recursion. In fact, Euler’s study of the infinite
product $\prod_{i=1}^{\infty} (1 - z^i)$ leads to the following recursion for $p(n)$ when $n > 0$:

$$p(n) = \sum_{m=1}^{\infty} (-1)^{m-1} [p(n - m(3m-1)/2) + p(n - m(3m+1)/2)]$$
$$= p(n-1) + p(n-2) - p(n-5) - p(n-7) + p(n-12) + p(n-15)$$
$$\quad - p(n-22) - p(n-26) + p(n-35) + p(n-40) - p(n-51) - p(n-57) + \cdots.$$

The initial conditions are $p(0) = 1$ and $p(j) = 0$ for all $j < 0$. It follows that, for each
fixed $n$, the recursive expression for $p(n)$ is really a finite sum, since the terms become zero
once the input to $p$ becomes negative. For example, Figure 2.18 illustrates the calculation
of $p(n)$ from Euler’s recursion for $1 \le n \le 12$. We will prove Euler’s recursion later (see
Theorem 5.55).
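Euler's recursion translates directly into a short loop. The sketch below fills a table of partition numbers, stopping the inner sum as soon as the argument of p would become negative. The function name `partition_numbers` is chosen here for illustration.

```python
def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] computed with Euler's recursion."""
    p = [1] + [0] * limit                       # p(0) = 1
    for n in range(1, limit + 1):
        total, m = 0, 1
        while m * (3 * m - 1) // 2 <= n:
            sign = 1 if m % 2 == 1 else -1      # (-1)^(m-1)
            total += sign * p[n - m * (3 * m - 1) // 2]
            second = n - m * (3 * m + 1) // 2
            if second >= 0:                     # p(j) = 0 for j < 0
                total += sign * p[second]
            m += 1
        p[n] = total
    return p


print(partition_numbers(12))
# [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77]
```

Each value uses only on the order of sqrt(n) earlier values, which is why this recursion is faster in practice than summing p(n,k) over k.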




2.12 Set Partitions 


An integer partition decomposes a positive integer into a sum of smaller integers. In contrast, 
the set partitions introduced below decompose a given set into a union of disjoint subsets. 

p(1)  = p(0) = 1
p(2)  = p(1) + p(0) = 1 + 1 = 2
p(3)  = p(2) + p(1) = 2 + 1 = 3
p(4)  = p(3) + p(2) = 3 + 2 = 5
p(5)  = p(4) + p(3) - p(0) = 5 + 3 - 1 = 7
p(6)  = p(5) + p(4) - p(1) = 7 + 5 - 1 = 11
p(7)  = p(6) + p(5) - p(2) - p(0) = 11 + 7 - 2 - 1 = 15
p(8)  = p(7) + p(6) - p(3) - p(1) = 15 + 11 - 3 - 1 = 22
p(9)  = p(8) + p(7) - p(4) - p(2) = 22 + 15 - 5 - 2 = 30
p(10) = p(9) + p(8) - p(5) - p(3) = 30 + 22 - 7 - 3 = 42
p(11) = p(10) + p(9) - p(6) - p(4) = 42 + 30 - 11 - 5 = 56
p(12) = p(11) + p(10) - p(7) - p(5) + p(0) = 56 + 42 - 15 - 7 + 1 = 77

FIGURE 2.18
Calculating p(n) using Euler’s recursion.


2.43. Definition: Set Partitions. Let X be a set. A set partition of X is a collection P 
of nonempty, pairwise disjoint subsets of X whose union is X. Each element of P is called 
a block of the partition. The cardinality of P (which may be infinite) is called the number 
of blocks of the partition. 


For example, if X = {1,2,3,4,5,6,7,8}, then

P = {{3,5,8}, {1,7}, {2}, {4,6}}

is a set partition of X with four blocks. Note that the ordering of the blocks in this list, and
the ordering of the elements within each block, is irrelevant when deciding the equality of
two set partitions. For instance,

{{6,4}, {1,7}, {2}, {5,8,3}}


is the same set partition as P. We can visualize a set partition P by drawing the elements 
of X in a circle, and then drawing smaller circles enclosing the elements of each block of P. 
See Figure 2.19. 


2.44. Definition: Stirling Numbers and Bell Numbers. Let S(n,k) be the number 
of set partitions of {1,2,...,n} into exactly k blocks. S(n,k) is called a Stirling number
of the second kind. Let B(n) be the total number of set partitions of {1,2,...,n}. B(n) is 
called a Bell number. 


One can check that S(n,k) is the number of partitions of any given n-element set 
into k blocks; similarly for B(n). Unlike binomial coefficients, there are no simple closed 
expressions for Stirling numbers and Bell numbers (although there are summation formulas 
and generating functions for these quantities). However, the Stirling numbers satisfy a 
recursion that can be used to compute S(n,k) and B(n) rapidly. 


2.45. Recursion for Stirling Numbers of the Second Kind. For all $n > 0$ and $k > 0$,

$$S(n,k) = S(n-1,k-1) + kS(n-1,k).$$

The initial conditions are $S(0,0) = 1$, $S(n,0) = 0$ for $n > 0$, and $S(0,k) = 0$ for $k > 0$.
Furthermore, $B(0) = 1$ and $B(n) = \sum_{k=1}^{n} S(n,k)$ for $n > 0$.


FIGURE 2.19
Diagram of the set partition {{3,5,8}, {1,7}, {2}, {4,6}}.


Proof. Fix $n,k > 0$. Let $A$ be the set of set partitions of $\{1,2,\ldots,n\}$ into exactly $k$ blocks.
Let $A' = \{P \in A : \{n\} \in P\}$ and $A'' = \{P \in A : \{n\} \notin P\}$. $A$ is the disjoint union of
$A'$ and $A''$, where $A'$ consists of those set partitions such that $n$ is in a block by itself, and
$A''$ consists of those set partitions such that $n$ is in a block with some other elements. To build
a typical partition $P \in A'$, we first choose an arbitrary set partition $P_0$ of $\{1,2,\ldots,n-1\}$
into $k-1$ blocks in any of $S(n-1,k-1)$ ways. Then we add the block $\{n\}$ to $P_0$ to get
$P$. To build a typical partition $P \in A''$, we first choose an arbitrary set partition $P_1$ of
$\{1,2,\ldots,n-1\}$ into $k$ blocks in any of $S(n-1,k)$ ways. Then we choose one of these $k$
blocks and add $n$ as a new member of that block. By the Sum Rule and Product Rule,

$$S(n,k) = |A| = |A'| + |A''| = S(n-1,k-1) + kS(n-1,k).$$

The initial conditions are immediate from the definitions, noting that $P = \emptyset$ is the unique
set partition of $X = \emptyset$. The formula for $B(n)$ follows from the Sum Rule. □


Figure 2.20 computes $S(n,k)$ and $B(n)$ for $n \le 8$ using the recursion. The entry $S(n,k)$
in row n and column k is computed by taking the number immediately northwest and adding 
k times the number immediately above the given entry. The numbers B(n) are found by 
adding the numbers in each row. 
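The table-filling procedure just described can be sketched as follows. Each entry is the number to the northwest plus k times the number directly above, and the Bell numbers are the row sums. The function name `stirling2_table` is chosen for illustration.

```python
def stirling2_table(N):
    """Return the table S[n][k] of Stirling numbers of the second kind, 0 <= n, k <= N."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = 1                                  # initial condition S(0,0) = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            # northwest entry plus k times the entry directly above
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return S


S = stirling2_table(8)
bell = [sum(row) for row in S]                   # B(n) is the sum of row n
print(S[8])      # [0, 1, 127, 966, 1701, 1050, 266, 28, 1]
print(bell)      # [1, 1, 2, 5, 15, 52, 203, 877, 4140]
```

The printed row and the list of row sums agree with the entries of Figure 2.20.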

Next we study a recursion satisfied by the Bell numbers. 


2.46. Recursion for Bell Numbers. For all $n > 0$,

$$B(n) = \sum_{k=0}^{n-1} \binom{n-1}{k} B(n-1-k).$$

The initial condition is $B(0) = 1$.


Proof. For $n > 0$, we construct a typical set partition $P$ counted by $B(n)$ as follows. Let
$k$ be the number of elements in the block of $P$ containing $n$, not including $n$ itself; thus,
$0 \le k \le n-1$. To build $P$, first choose $k$ elements from $\{1,2,\ldots,n-1\}$ that belong to
the same block as $n$ in any of $\binom{n-1}{k}$ ways. Then, choose an arbitrary set partition of the
$n-1-k$ elements that do not belong to the same block as $n$; this choice can be made in any
of $B(n-1-k)$ ways. The recursion now follows from the Sum Rule and Product Rule. □


        k=0  k=1  k=2  k=3   k=4   k=5  k=6  k=7  k=8   B(n)
n=0:      1    0    0    0     0     0    0    0    0      1
n=1:      0    1    0    0     0     0    0    0    0      1
n=2:      0    1    1    0     0     0    0    0    0      2
n=3:      0    1    3    1     0     0    0    0    0      5
n=4:      0    1    7    6     1     0    0    0    0     15
n=5:      0    1   15   25    10     1    0    0    0     52
n=6:      0    1   31   90    65    15    1    0    0    203
n=7:      0    1   63  301   350   140   21    1    0    877
n=8:      0    1  127  966  1701  1050  266   28    1   4140

FIGURE 2.20
Calculating S(n,k) and B(n) recursively.


For example, assuming that $B(m)$ is already known for $m < 8$ (see Figure 2.20), we
calculate

$$B(8) = \binom{7}{0}B(7) + \binom{7}{1}B(6) + \binom{7}{2}B(5) + \cdots + \binom{7}{7}B(0)$$
$$= 1 \cdot 877 + 7 \cdot 203 + 21 \cdot 52 + 35 \cdot 15 + 35 \cdot 5 + 21 \cdot 2 + 7 \cdot 1 + 1 \cdot 1$$
$$= 4140.$$
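The Bell recursion can also be run directly, without going through the Stirling numbers. The following sketch builds the list of Bell numbers one entry at a time; the function name `bell_numbers` is chosen for illustration.

```python
from math import comb


def bell_numbers(N):
    """Return [B(0), ..., B(N)] using B(n) = sum_k C(n-1, k) B(n-1-k)."""
    B = [1]                                       # B(0) = 1
    for n in range(1, N + 1):
        # k = number of other elements sharing a block with n
        B.append(sum(comb(n - 1, k) * B[n - 1 - k] for k in range(n)))
    return B


print(bell_numbers(8))   # [1, 1, 2, 5, 15, 52, 203, 877, 4140]
```

The last entry reproduces the hand calculation of B(8) = 4140.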




2.13 Surjections, Balls in Boxes, and Equivalence Relations 


This section shows how set partitions and Stirling numbers can be used to count surjections 
and equivalence relations. We also revisit the balls-in-boxes problem from §1.13. Recall that 
a function $f : X \to Y$ is surjective (or onto) iff for all $y \in Y$ there exists $x \in X$ with
$f(x) = y$.


2.47. The Surjection Rule. The number of surjections from an n-element set onto a 
k-element set is k!S(n,k), where S(n,k) is a Stirling number of the second kind. 


Proof. Without loss of generality, we may assume we are counting surjections with domain
$\{1,2,\ldots,n\}$ and codomain $\{1,2,\ldots,k\}$. To build such a surjection $f$, first choose a set
partition $P$ of $\{1,2,\ldots,n\}$ into $k$ blocks in any of $S(n,k)$ ways. Choose one of these blocks
(in $k$ ways), and let $f$ map everything in this block to 1. Then choose a different block (in
$k-1$ ways), and let $f$ map everything in this block to 2. Continue similarly; at the last
stage, there is 1 block left, and we let $f$ map everything in this block to $k$. By the Product
Rule, the number of surjections is

$$S(n,k) \cdot k \cdot (k-1) \cdot \ldots \cdot 1 = k!\,S(n,k).$$ □


2.48. Example. To illustrate this proof, suppose $n = 8$ and $k = 4$. In the first step, say we
choose the set partition $P = \{\{1,4,7\}, \{2\}, \{3,8\}, \{5,6\}\}$. In the next four steps, we choose
a permutation (linear ordering) of the four blocks of $P$, say

$$\{2\},\ \{5,6\},\ \{3,8\},\ \{1,4,7\}.$$

Now we define the associated surjection $f$ by setting

$$f(2) = 1, \quad f(5) = f(6) = 2, \quad f(3) = f(8) = 3, \quad f(1) = f(4) = f(7) = 4.$$
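The Surjection Rule is easy to confirm by brute force for small parameters. The sketch below enumerates all functions from an n-element set to a k-element set, keeps the surjective ones, and compares the count with k! S(n,k). The function names `stirling2` and `count_surjections` are chosen for illustration, and the direct enumeration is only practical for small n and k.

```python
from itertools import product
from math import factorial


def stirling2(n, k):
    """Stirling number of the second kind via S(n,k) = S(n-1,k-1) + k S(n-1,k)."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def count_surjections(n, k):
    """Count surjections {0..n-1} -> {0..k-1} by listing every function."""
    return sum(1 for f in product(range(k), repeat=n) if len(set(f)) == k)


print(count_surjections(5, 3), factorial(3) * stirling2(5, 3))   # 150 150
```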


There is also a recursion for surjections similar to the recursion for set partitions; see 
Exercise 2-52. Using integer partitions, set partitions, and surjections, we can now continue 
our discussion of the balls-in-boxes problem from §1.13. 


2.49. Theorem: Balls in Nonempty Boxes. Consider distributions of a balls into b 
boxes where every box must be nonempty. 

(a) For labeled balls and labeled boxes, there are $b!\,S(a,b)$ distributions.

(b) For unlabeled balls and labeled boxes, there are $\binom{a-1}{b-1}$ distributions.

(c) For labeled balls and unlabeled boxes, there are S(a,b) distributions. 

(d) For unlabeled balls and unlabeled boxes, there are p(a,b) distributions. 


Proof. (a) As in Rule 1.93, we can model the placement of balls in boxes by a function
$f : \{1,2,\ldots,a\} \to \{1,2,\ldots,b\}$, where $f(x) = y$ means that ball $x$ is placed in box $y$. The
requirement that every box be nonempty translates to the requirement that $f$ is surjective.
So (a) follows from the Surjection Rule. Part (b) was proved in Rule 1.94.

(c) We can model a distribution of balls labeled 1,2,...,a into b unlabeled boxes as a set 
partition of {1,2,...,a} with b blocks, where each block consists of the set of balls placed in 
the same box. For example, if a = 8 and b = 4, the set partition {{3,5, 8}, {1, 7}, {2}, {4, 6}} 
models the distribution where one box contains balls 3, 5, and 8; another box contains balls 
1 and 7; another box contains ball 2; and another box contains 4 and 6. Thus, the number 
of distributions in (c) is the Stirling number S(a, b). 

(d) We can model a distribution of a unlabeled balls into b unlabeled boxes as an integer 
partition of a into b parts, where each part counts the number of balls in one of the boxes. 
For example, if a = 8 and b = 4, the integer partition (3,3,1,1) models the distribution 
where two boxes contain three balls and two boxes contain one ball. Thus, the number of 
distributions in (d) is the partition number p(a, b). □
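All four counts in Theorem 2.49 can be observed in one brute-force pass. The sketch below lists every placement of a labeled balls into b labeled boxes with no box empty, then forgets the ball labels, the box labels, or both. The function name `nonempty_distributions` is chosen for illustration, and the enumeration is feasible only for small a and b.

```python
from itertools import product


def nonempty_distributions(a, b):
    """Return the counts for cases (a), (b), (c), (d) of the theorem, by enumeration."""
    labeled, balls_unl, boxes_unl, both_unl = set(), set(), set(), set()
    for f in product(range(b), repeat=a):        # f[x] = box holding ball x
        if len(set(f)) < b:                      # skip placements with an empty box
            continue
        labeled.add(f)                           # (a): the surjection itself
        counts = tuple(f.count(y) for y in range(b))
        balls_unl.add(counts)                    # (b): only box occupancies matter
        blocks = frozenset(frozenset(x for x in range(a) if f[x] == y)
                           for y in range(b))
        boxes_unl.add(blocks)                    # (c): only the set partition matters
        both_unl.add(tuple(sorted(counts, reverse=True)))   # (d): an integer partition
    return len(labeled), len(balls_unl), len(boxes_unl), len(both_unl)


print(nonempty_distributions(5, 3))   # (150, 6, 25, 2)
```

For a = 5 and b = 3 the four counts are 3! S(5,3) = 150, the binomial coefficient C(4,2) = 6, S(5,3) = 25, and p(5,3) = 2.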


The rest of this section assumes the reader has some previous exposure to equivalence 
relations. We review the relevant definitions now. 


2.50. Definition: Types of Relations. Let $X$ be any set. A relation on $X$ is any subset
of $X \times X$. If $R$ is a relation on $X$ and $x,y \in X$, we may write $xRy$ as an abbreviation
for $(x,y) \in R$. We read this symbol as “$x$ is related to $y$ under $R$.” A relation $R$ on $X$ is
reflexive on $X$ iff $xRx$ for all $x \in X$. $R$ is irreflexive on $X$ iff $xRx$ is false for all $x \in X$. $R$
is symmetric iff for all $x,y \in X$, $xRy$ implies $yRx$. $R$ is antisymmetric iff for all $x,y \in X$,
$xRy$ and $yRx$ imply $x = y$. $R$ is transitive iff for all $x,y,z \in X$, $xRy$ and $yRz$ imply $xRz$.
$R$ is an equivalence relation on $X$ iff $R$ is symmetric, transitive, and reflexive on $X$. If $R$
is an equivalence relation and $x_0 \in X$, the equivalence class of $x_0$ relative to $R$ is the set
$[x_0]_R = \{y \in X : x_0 R y\}$.


2.51. Theorem: Set Partitions and Equivalence Relations. Suppose $X$ is a fixed
set. Let $\mathcal{A}$ be the set of all set partitions of $X$, and let $\mathcal{B}$ be the set of all equivalence
relations on $X$. There are canonical bijections $\phi : \mathcal{A} \to \mathcal{B}$ and $\phi' : \mathcal{B} \to \mathcal{A}$. If
$P \in \mathcal{A}$, then the number of blocks of $P$ equals the number of equivalence classes of $\phi(P)$.
Hence, the Stirling number $S(n,k)$ is the number of equivalence relations on an $n$-element set
having $k$ equivalence classes, and the Bell number $B(n)$ is the number of equivalence relations
on an $n$-element set.


Proof. We only sketch the proof, leaving many details as exercises. Given a set partition
$P \in \mathcal{A}$, define a relation $\phi(P)$ on $X$ by

$$\phi(P) = \{(x,y) \in X \times X : \exists S \in P,\ x \in S \text{ and } y \in S\}.$$

In other words, $x$ is related to $y$ under $\phi(P)$ iff $x$ and $y$ belong to the same block of $P$. The
reader may check that $\phi(P)$ is indeed an equivalence relation on $X$, i.e., that $\phi(P) \in \mathcal{B}$.
Thus, $\phi$ is a well-defined function from $\mathcal{A}$ into $\mathcal{B}$.

Given an equivalence relation $R \in \mathcal{B}$, define

$$\phi'(R) = \{[x]_R : x \in X\}.$$

In other words, the blocks of $\phi'(R)$ are precisely the equivalence classes of $R$. The reader
may check that $\phi'(R)$ is indeed a set partition of $X$, i.e., that $\phi'(R) \in \mathcal{A}$. Thus, $\phi'$ is a
well-defined function from $\mathcal{B}$ into $\mathcal{A}$.

To complete the proof, one must check that $\phi$ and $\phi'$ are two-sided inverses of one
another. In other words, for all $P \in \mathcal{A}$, $\phi'(\phi(P)) = P$; and for all $R \in \mathcal{B}$,
$\phi(\phi'(R)) = R$. It follows that $\phi$ and $\phi'$ are bijections. □
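For a finite set, the two maps in this proof can be written out concretely. In the sketch below a set partition is a Python set of frozensets and a relation is a set of ordered pairs; the function names `relation_from_partition` and `partition_from_relation` are chosen for illustration and stand for the two bijections between set partitions and equivalence relations.

```python
def relation_from_partition(P):
    """All pairs (x, y) such that x and y lie in a common block of P."""
    return {(x, y) for block in P for x in block for y in block}


def partition_from_relation(R, X):
    """The set of equivalence classes [x]_R = {y in X : x R y}, for x in X."""
    return {frozenset(y for y in X if (x, y) in R) for x in X}


X = set(range(1, 9))
P = {frozenset({3, 5, 8}), frozenset({1, 7}), frozenset({2}), frozenset({4, 6})}
R = relation_from_partition(P)
print(len(R))                                   # 18 ordered pairs
print(partition_from_relation(R, X) == P)       # True
```

The round trip returns the original partition, which is the first of the two inverse identities in the proof.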




2.14 Stirling Numbers and Rook Theory 


Recall that the Stirling numbers S(n,k) count set partitions of an n-element set into k 
blocks. This section gives another combinatorial interpretation of these Stirling numbers. 
We show that S(n,k) counts certain placements of rooks on a triangular chessboard. A 
slight variation of this setup leads us to introduce the (signless) Stirling numbers of the 
first kind. The relationship between the two kinds of Stirling numbers will be studied in the 
following section. 


2.52. Definition: Ferrers Boards and Rooks. A Ferrers board is the diagram of an 
integer partition, viewed as a collection of unit squares as in §2.11. A rook is a chess piece 
that can occupy any of the squares in a Ferrers board. In chess, a rook can move any 
number of squares horizontally or vertically from its current position in a single move. A 
rook located in row i and column j of a Ferrers board attacks all squares in row i and all
squares in column j. 


For example, in the Ferrers board shown below, the rook attacks all squares on the board 
marked with a dot (and its own square). 


For each $n > 0$, let $\Delta_n$ denote the diagram of the partition $(n-1, n-2, \ldots, 3, 2, 1)$. $\Delta_n$ is
a triangular Ferrers board with $n(n-1)/2$ total squares. For example, $\Delta_5$ is shown below.


2.53. Definition: Non-attacking Rook Placements. A placement of k rooks on a given
Ferrers board is a subset of k squares in the Ferrers board. These k squares represent the 
locations of k identical rooks on the board. A placement of rooks on a Ferrers board is 
called non-attacking iff no rook occupies a square attacked by another rook. Equivalently, 
all rooks in the placement occupy distinct rows and distinct columns of the board. 


2.54. Example. The following diagram illustrates a non-attacking placement of three rooks 
on the Ferrers board corresponding to the partition (7, 4, 4, 3, 2). 


2.55. Rook Interpretation of Stirling Numbers of the Second Kind. For $n > 0$
and $0 \le k \le n$, let $S'(n,k)$ denote the number of non-attacking placements of $n-k$ rooks
on the Ferrers board $\Delta_n$. If $n > 1$ and $0 < k < n$, then

$$S'(n,k) = S'(n-1,k-1) + kS'(n-1,k).$$

The initial conditions are $S'(n,0) = 0$ for $n > 0$ and $S'(n,n) = 1$ for $n > 0$. Therefore,
$S'(n,k) = S(n,k)$, a Stirling number of the second kind.


Proof. Fix $n > 1$ with $0 < k < n$. Let $A$, $B$, and $C$ denote the sets of placements counted by
$S'(n,k)$, $S'(n-1,k-1)$, and $S'(n-1,k)$, respectively. Let $A_0$ consist of all rook placements
in $A$ with no rook in the top row, and let $A_1$ consist of all rook placements in $A$ with a rook
in the top row. $A$ is the disjoint union of $A_0$ and $A_1$. Deleting the top row of the Ferrers
board $\Delta_n$ produces the smaller Ferrers board $\Delta_{n-1}$. It follows that deleting the (empty) top
row of a rook placement in $A_0$ gives a bijection between $A_0$ and $B$ (note that a placement
in $B$ involves $(n-1) - (k-1) = n-k$ rooks). On the other hand, we can build a typical
rook placement in $A_1$ as follows. First, choose a placement of $n-k-1$ non-attacking rooks
from the set $C$, and use this rook placement to fill the bottom $n-1$ rows of $\Delta_n$. These
rooks occupy $n-k-1$ distinct columns. This leaves $(n-1) - (n-k-1) = k$ columns in the
top row in which we are allowed to place the final rook. By the Product Rule, $|A_1| = |C| k$.
Using the Sum Rule, we conclude that

$$S'(n,k) = |A| = |A_0| + |A_1| = |B| + k|C| = S'(n-1,k-1) + kS'(n-1,k).$$

For $n > 0$, we cannot place $n$ non-attacking rooks on the Ferrers board $\Delta_n$ (which has
only $n-1$ columns), and hence $S'(n,0) = 0$. On the other hand, for any $n > 0$ there
is a unique placement of zero rooks on $\Delta_n$. This placement is non-attacking (vacuously),
and hence $S'(n,n) = 1$. Counting set partitions, we see that $S(n,0) = 0$ for $n > 0$ and
$S(n,n) = 1$ for $n > 0$. Since $S'(n,k)$ and $S(n,k)$ satisfy the same recursion and initial
conditions, a routine induction argument (cf. Proposition 2.29) shows that $S'(n,k) = S(n,k)$
for all $n$ and $k$. □


2.56. Remark. We have given combinatorial proofs that the numbers $S'(n,k)$ and $S(n,k)$
satisfy the same recursion. We can link together these proofs to get a recursively defined
bijection between rook placements and set partitions, using the ideas in Remark 2.34. We can
also directly define a bijection between rook placements and set partitions. We illustrate
such a bijection through an example. Figure 2.21 displays a rook placement counted by
$S'(8,3)$. We write the numbers 1 through $n$ below the last square in each column of the
diagram, as shown in the figure. We view these numbers as labeling both the rows and
columns of the diagram; note that the column labels increase from left to right, while row
labels decrease from top to bottom. The bijection between non-attacking rook placements
$\pi$ and set partitions $P$ acts as follows. For all $j < i \le n$, there is a rook in row $i$ and column
$j$ of $\pi$ iff $i$ and $j$ are consecutive elements in the same block of $P$ (writing elements of the
block in increasing order). For example, the rook placement $\pi$ in Figure 2.21 maps to the
set partition

$$P = \{\{1,3,4,5,7\}, \{2,6\}, \{8\}\}.$$

The set partition $\{\{2\}, \{1,5,8\}, \{4,6,7\}, \{3\}\}$ maps to the rook placement shown in
Figure 2.22. It can be checked that a non-attacking placement of $n-k$ rooks on $\Delta_n$ corresponds
to a set partition of $\{1,2,\ldots,n\}$ with exactly $k$ blocks; furthermore, the rook placement
associated to a given set partition is automatically non-attacking.


FIGURE 2.21
A rook placement counted by S’(8, 3).


FIGURE 2.22
The rook placement associated to {{2}, {1,5,8}, {4,6,7}, {3}}.
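The direct bijection of Remark 2.56 can be sketched in code by recording each rook as a pair (row label, column label), using the labeling described in the remark. This encoding, and the function names `rooks_from_partition` and `partition_from_rooks`, are assumptions made for illustration; the sketch does not draw the board.

```python
def rooks_from_partition(P):
    """One rook at (row i, column j) for each pair j < i of consecutive block elements."""
    rooks = set()
    for block in P:
        elems = sorted(block)
        for j, i in zip(elems, elems[1:]):
            rooks.add((i, j))
    return rooks


def partition_from_rooks(rooks, n):
    """Rebuild the set partition of {1..n} by following rooks as links j -> i."""
    successor = {j: i for (i, j) in rooks}       # the rook in column j sits in row i
    has_predecessor = {i for (i, j) in rooks}
    P = set()
    for start in range(1, n + 1):
        if start in has_predecessor:
            continue                             # not the least element of its block
        block, x = [start], start
        while x in successor:
            x = successor[x]
            block.append(x)
        P.add(frozenset(block))
    return P


P = {frozenset({1, 3, 4, 5, 7}), frozenset({2, 6}), frozenset({8})}
rooks = rooks_from_partition(P)
print(sorted(rooks))                          # [(3, 1), (4, 3), (5, 4), (6, 2), (7, 5)]
print(partition_from_rooks(rooks, 8) == P)    # True
```

A partition of {1,...,8} into 3 blocks yields 8 - 3 = 5 rooks, and no two of them share a row label or a column label, in line with the non-attacking condition.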


By modifying the way a rook can move, we obtain the Stirling numbers of the first kind 
described in the following definition. 


2.57. Definition: File Rooks and Stirling Numbers of the First Kind. A file rook is
a new chess piece that attacks only the squares in its column. For all $n > 0$ and $0 \le k \le n$,
let $s'(n,k)$ denote the number of placements of $n-k$ non-attacking file rooks on the Ferrers
board $\Delta_n$. The numbers $s'(n,k)$ are called signless Stirling numbers of the first kind. The
numbers $s(n,k) = (-1)^{n-k} s'(n,k)$ are called (signed) Stirling numbers of the first kind. By
convention, we set $s(0,0) = 1 = s'(0,0)$.


Stirling numbers of the first kind also count permutations of n letters consisting of k 
disjoint cycles; we discuss this combinatorial interpretation in §3.6. 


2.58. Recursion for Signless Stirling Numbers of the First Kind. For all $n > 1$
and $0 < k < n$,

$$s'(n,k) = s'(n-1,k-1) + (n-1)s'(n-1,k).$$

The initial conditions are $s'(n,0) = 0$ for $n > 0$ and $s'(n,n) = 1$ for $n > 0$.


Proof. Assume $n > 1$ and $0 < k < n$. Let $A$, $B$, and $C$ denote the sets of file rook placements
counted by $s'(n,k)$, $s'(n-1,k-1)$, and $s'(n-1,k)$, respectively. Write $A$ as the disjoint
union of $A_0$ and $A_1$, where $A_0$ is the set of placements in $A$ with no file rook in the leftmost
column, and $A_1$ is the set of placements in $A$ with a file rook in the leftmost column. We
can get a bijection from $A_0$ to $B$ by deleting the empty leftmost column of a placement in
$A_0$. On the other hand, we can build a typical file rook placement in $A_1$ as follows. First,
choose the position of the file rook in the leftmost column of $\Delta_n$ in $n-1$ ways. Second,
choose any placement of $n-k-1$ non-attacking file rooks from the set $C$, and use this
placement to fill the remaining $n-1$ columns of $\Delta_n$. These file rooks do not attack the file
rook in the first column. By the Sum Rule and Product Rule,

$$s'(n,k) = |A| = |A_0| + |A_1| = |B| + (n-1)|C| = s'(n-1,k-1) + (n-1)s'(n-1,k).$$ □


We can use the recursion and initial conditions to compute the (signed or unsigned) 
Stirling numbers of the first kind. This is done in Figure 2.23; compare to the computation 
of Stirling numbers of the second kind in Figure 2.20. 


        k=0   k=1    k=2    k=3   k=4   k=5   k=6   k=7
n=0:      1     0      0      0     0     0     0     0
n=1:      0     1      0      0     0     0     0     0
n=2:      0    -1      1      0     0     0     0     0
n=3:      0     2     -3      1     0     0     0     0
n=4:      0    -6     11     -6     1     0     0     0
n=5:      0    24    -50     35   -10     1     0     0
n=6:      0  -120    274   -225    85   -15     1     0
n=7:      0   720  -1764   1624  -735   175   -21     1

FIGURE 2.23
Signed Stirling numbers of the first kind.


There is a surprising relation between the two arrays of numbers in these figures, which
explains the extra signs in the definition of $s(n,k)$. Specifically, for any fixed $n > 0$, consider
the lower triangular $n \times n$ matrices $A = (s(i,j))_{1 \le i,j \le n}$ and $B = (S(i,j))_{1 \le i,j \le n}$. It
turns out that $A$ and $B$ are inverse matrices. The reader may check this for small $n$ using
Figure 2.20 and Figure 2.23. We will prove this fact for all $n$ in the next section.
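The inverse relation can be checked numerically for small n. The sketch below fills both Stirling triangles from their recursions, attaches the signs (-1)^(n-k) to the first kind, and multiplies the two matrices. The function name `stirling_tables` and the variable names are chosen for illustration.

```python
def stirling_tables(N):
    """Return (S, c, s): second kind, signless first kind, signed first kind."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    c = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = c[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
            c[n][k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    # signed numbers: s(n,k) = (-1)^(n-k) s'(n,k)
    s = [[(1 if (n - k) % 2 == 0 else -1) * c[n][k] for k in range(N + 1)]
         for n in range(N + 1)]
    return S, c, s


N = 7
S, c, s = stirling_tables(N)
print(s[7])    # [0, 720, -1764, 1624, -735, 175, -21, 1]

# Product of the matrices (s(i,j)) and (S(i,j)) for 1 <= i, j <= N.
product = [[sum(s[i][k] * S[k][j] for k in range(1, N + 1)) for j in range(1, N + 1)]
           for i in range(1, N + 1)]
identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
print(product == identity)    # True
```

The row printed for n = 7 matches Figure 2.23, and the product is the identity matrix.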

2.15 Stirling Numbers and Polynomials


Our recursions for Stirling numbers (of both kinds) lead to algebraic proofs of certain 
polynomial identities. These identities express various polynomials as linear combinations 
of other polynomials, where the Stirling numbers are the coefficients involved in the linear 
combination. Thus, the Stirling numbers appear as entries in the transition matrices between 
some common bases for the vector space of one-variable polynomials. This linear-algebraic 
interpretation of Stirling numbers will be used to show the inverse relation between the two 
triangular matrices of Stirling numbers (cf. Figures 2.20 and 2.23). 


2.59. Polynomial Identity for Stirling Numbers of the Second Kind. For all $n \ge 0$
and all real $x$,

$$x^n = \sum_{k=0}^{n} S(n,k)\, x(x-1)(x-2)\cdots(x-k+1). \qquad (2.3)$$


We give an algebraic proof based on the Stirling recursion, as well as a combinatorial
proof based on rook placements.

Algebraic Proof. Recall that $S(0,0) = 1$ and $S(n,k) = S(n-1,k-1) + kS(n-1,k)$ for
$n \ge 1$ and $1 \le k \le n$. We prove (2.3) by induction on $n$. If $n = 0$, the right side is
$S(0,0) = 1 = x^0$, so the identity holds. For the induction step, fix $n \ge 1$ and assume that
$x^{n-1} = \sum_{k=0}^{n-1} S(n-1,k)\, x(x-1)\cdots(x-k+1)$. Multiplying both sides by $x = (x-k) + k$,
we can write

$$x^n = \sum_{k=0}^{n-1} S(n-1,k)\, x(x-1)\cdots(x-k+1)(x-k) + \sum_{k=0}^{n-1} kS(n-1,k)\, x(x-1)\cdots(x-k+1)$$
$$= \sum_{j=0}^{n-1} S(n-1,j)\, x(x-1)\cdots(x-j) + \sum_{k=0}^{n-1} kS(n-1,k)\, x(x-1)\cdots(x-k+1).$$

In the first summation, replace $j$ by $k-1$. The calculation continues:

$$x^n = \sum_{k=1}^{n} S(n-1,k-1)\, x(x-1)\cdots(x-k+1) + \sum_{k=0}^{n-1} kS(n-1,k)\, x(x-1)\cdots(x-k+1)$$
$$= \sum_{k=0}^{n} S(n-1,k-1)\, x(x-1)\cdots(x-k+1) + \sum_{k=0}^{n} kS(n-1,k)\, x(x-1)\cdots(x-k+1)$$
$$= \sum_{k=0}^{n} [S(n-1,k-1) + kS(n-1,k)]\, x(x-1)\cdots(x-k+1)$$
$$= \sum_{k=0}^{n} S(n,k)\, x(x-1)\cdots(x-k+1),$$

where the second equality uses the initial conditions $S(n-1,-1) = S(n-1,n) = 0$, and
the final equality uses the recursion for $S(n,k)$.

Combinatorial Proof. Both sides of (2.3) are polynomials in x, so it suffices to verify the 
identity when x is a positive integer (see Remark 2.7). Fix 7,n € Zyo. Consider the extended 


88 Combinatorics, Second Edition 


1) obtained by adding x new rows 


Ferrers board A,(x) = (n,...,n,n-1,n-2,n—- an Syd, 
1). For example, A5(3) is the board 


of length n above the board Re =(n-—1,n- a 
shown here. 


Let $A$ be the set of placements of $n$ non-attacking rooks on $\Delta_n(x)$. One way to build a
typical rook placement in $A$ is to place one rook in each column, working from right to left.
The rook in the rightmost column can go in any of $x$ squares. The rook in the next column
can go in any of $(x+1)-1 = x$ squares, since one row is now attacked by the rook in the
rightmost column. In general, the rook located $i > 0$ columns left of the rightmost column
can go in any of $(x+i)-i = x$ squares, since $i$ distinct squares in this column are already
attacked by rooks placed previously when we choose the position of the rook in this column.
The Product Rule therefore gives $|A| = x^n$.

On the other hand, for $0 \le k \le n$, let $A_k$ be the set of rook placements in $A$ in which
there are exactly $k$ rooks in the $x$ new rows (and hence $n-k$ rooks on the board $\Delta_n$). To
build a typical rook placement in $A_k$, first place $n-k$ non-attacking rooks in $\Delta_n$ in any of
$S(n,k)$ ways. There are now $k$ unused columns of new squares, each of which consists of $x$
squares. Visit these columns from left to right, placing one rook in each column. There are
$x$ choices for the first rook, then $x-1$ choices for the second rook (since the first rook's row
must be avoided), then $x-2$ choices for the third rook, and so on. By the Product Rule,
$|A_k| = S(n,k)\,x(x-1)(x-2)\cdots(x-k+1)$. Since $A$ is the disjoint union of $A_0, A_1, \ldots, A_n$,
the Sum Rule gives

\[ |A| = \sum_{k=0}^{n} S(n,k)\,x(x-1)(x-2)\cdots(x-k+1). \]

The identity follows by comparing our two formulas for $|A|$.
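For small boards, the two counts in this proof can be confirmed by brute force. The sketch below is illustrative only: it models the extended board by its list of row lengths (an encoding chosen here, not taken from the text) and enumerates all placements of $n$ non-attacking rooks, grouped by the number of rooks in the $x$ new rows.

```python
def row_lengths(n, x):
    """Row lengths of the extended board: x rows of length n, then n-1, ..., 1."""
    return [n] * x + list(range(n - 1, 0, -1))


def count_placements(n, x):
    """Return counts, where counts[k] is the number of placements of n
    non-attacking rooks on the extended board with exactly k rooks in the
    x new rows. With n rooks and n columns, every column holds one rook."""
    lengths = row_lengths(n, x)
    counts = [0] * (n + 1)

    def place(col, used_rows, new_row_rooks):
        if col == n:
            counts[new_row_rooks] += 1
            return
        for row, length in enumerate(lengths):
            # Rows are left-justified, so cell (row, col) exists iff col < length.
            if col < length and row not in used_rows:
                place(col + 1, used_rows | {row}, new_row_rooks + (row < x))

    place(0, frozenset(), 0)
    return counts
```

For $n = 3$ and $x = 2$, the counts are $S(3,k)\,x(x-1)\cdots(x-k+1)$ for $k = 0, 1, 2, 3$, namely 0, 2, 6, 0, and they sum to $2^3 = 8$.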


2.60. Polynomial Identity for Signless Stirling Numbers of the First Kind. For
all $n \ge 0$ and all real $x$,

\[ x(x+1)(x+2)\cdots(x+n-1) = \sum_{k=0}^{n} s'(n,k)\,x^k. \tag{2.4} \]


Proof. We ask the reader to supply an algebraic proof in Exercise 2-73; we give a combinatorial
proof here. Recall that $s'(n,k)$ counts placements of $n-k$ non-attacking file rooks on
$\Delta_n$. Fix an integer $x > 0$, and let $A$ be the set of placements of $n$ non-attacking file rooks
on the extended Ferrers board $\Delta_n(x)$. Placing the file rooks on the board one column at a
time, working from right to left, the Product Rule shows that

\[ |A| = x(x+1)(x+2)\cdots(x+n-1). \]

On the other hand, we can write $A$ as the disjoint union of sets $A_k$, where $A_k$ consists of
the file rook placements in $A$ in which exactly $k$ file rooks occupy the new squares above
$\Delta_n$. To build an object in $A_k$, first place $n-k$ non-attacking file rooks in $\Delta_n$ in any of
$s'(n,k)$ ways. There are now $k$ unused columns, each of which has $x$ new squares, and $k$
file rooks left to be placed. Visit the unused columns from left to right, and in each column
choose one of the $x$ new squares to be occupied by a file rook. By the Product Rule,
$|A_k| = s'(n,k)\,x^k$. The Sum Rule now gives

\[ |A| = \sum_{k=0}^{n} s'(n,k)\,x^k. \]

Equating the two expressions for $|A|$ completes the proof. □
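Identity (2.4) can also be checked coefficient by coefficient. The sketch below assumes the recursion $s'(n,k) = s'(n-1,k-1) + (n-1)s'(n-1,k)$ with $s'(0,0) = 1$, as recorded in the chapter summary; the function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def signless_stirling1(n, k):
    """s'(n,k) from s'(n,k) = s'(n-1,k-1) + (n-1)*s'(n-1,k), with s'(0,0) = 1."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    return signless_stirling1(n - 1, k - 1) + (n - 1) * signless_stirling1(n - 1, k)


def rising_coefficients(n):
    """Coefficients [c_0, ..., c_n] of x(x+1)...(x+n-1), lowest degree first."""
    coeffs = [1]
    for i in range(n):
        # Multiply the current polynomial p(x) by (x + i).
        shifted = [0] + coeffs                   # x * p(x)
        scaled = [i * c for c in coeffs] + [0]   # i * p(x)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs
```

For example, $x(x+1)(x+2)(x+3) = x^4 + 6x^3 + 11x^2 + 6x$, so `rising_coefficients(4)` lists the values $s'(4,0), \ldots, s'(4,4) = 0, 6, 11, 6, 1$.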


The proof technique of using extended boards such as $\Delta_n(x)$ can be recycled to prove
other results in rook theory, as we will see in §12.3.


2.61. Polynomial Identity for Signed Stirling Numbers of the First Kind. For all
$n \ge 0$ and all real $x$,

\[ x(x-1)(x-2)\cdots(x-n+1) = \sum_{k=0}^{n} s(n,k)\,x^k. \tag{2.5} \]


Proof. Replace $x$ by $-x$ in (2.4) to obtain

\[ (-x)(-x+1)(-x+2)\cdots(-x+n-1) = \sum_{k=0}^{n} s'(n,k)(-x)^k. \]

Factoring out $-1$'s, we get

\[ (-1)^n x(x-1)(x-2)\cdots(x-n+1) = \sum_{k=0}^{n} (-1)^k s'(n,k)\,x^k. \]

Moving the $(-1)^n$ to the right side and recalling that $s(n,k) = (-1)^{n+k}s'(n,k)$, the result
follows. □


2.62. Summation Formula for Stirling Numbers of the First Kind. For all $n \ge 1$
and $1 \le k \le n$, we have

\[ s'(n,k) = \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n-1} i_1 i_2 \cdots i_{n-k}. \]

Proof. Recall that $s'(n,k)$ counts the number of placements of $n-k$ non-attacking file rooks
on the triangular Ferrers board $\Delta_n$. Since file rooks only attack cells in their columns, a
placement is non-attacking iff all file rooks occupy distinct columns of $\Delta_n$. Let us classify
file rook placements based on which columns contain file rooks. Suppose the $n-k$ file rooks
appear in the columns of lengths $i_1, i_2, \ldots, i_{n-k}$, where $1 \le i_1 < i_2 < \cdots < i_{n-k} \le n-1$. The
Product Rule shows that the number of placements of file rooks in these columns is $i_1 i_2 \cdots i_{n-k}$.
The formula in the theorem now follows from the Sum Rule. □
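The summation formula translates directly into a few lines of code. This is a sketch with an illustrative function name; it sums the products of column lengths over all $(n-k)$-element subsets of $\{1, \ldots, n-1\}$.

```python
from itertools import combinations
from math import prod


def signless_stirling1_by_sum(n, k):
    """Sum of i_1 * ... * i_{n-k} over all 1 <= i_1 < ... < i_{n-k} <= n-1.

    For k = n the only subset is empty, and the empty product is 1.
    """
    return sum(prod(cols) for cols in combinations(range(1, n), n - k))
```

For $n = 4$ this gives $s'(4,2) = 1\cdot 2 + 1\cdot 3 + 2\cdot 3 = 11$ and $s'(4,1) = 1\cdot 2\cdot 3 = 6$.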


To understand the linear-algebraic significance of the preceding results, we need to in- 
troduce some bases for the vector space V of all polynomials in one variable with real 
coefficients. (See the Appendix for a review of the linear algebra concepts used here.) 


2.63. Definition: Monomial, Falling Factorial, and Rising Factorial Bases. For
any integer $n > 0$, define the falling factorial polynomials

\[ (x){\downarrow}_0 = 1, \qquad (x){\downarrow}_n = x(x-1)(x-2)\cdots(x-n+1) \]

and the rising factorial polynomials

\[ (x){\uparrow}_0 = 1, \qquad (x){\uparrow}_n = x(x+1)(x+2)\cdots(x+n-1). \]

The monomial basis of $V$ is the indexed set $M = \{x^n : n \ge 0\}$. The falling factorial basis
of $V$ is $F = \{(x){\downarrow}_n : n \ge 0\}$. The rising factorial basis of $V$ is $R = \{(x){\uparrow}_n : n \ge 0\}$. Define
$M_{\le N} = \{x^n : 0 \le n \le N\}$, and define $F_{\le N}$ and $R_{\le N}$ similarly.


It can be checked that any indexed collection of polynomials $\{p_n(x) : n \ge 0\}$ such that
$\deg(p_n) = n$ for all $n$ is a basis of $V$. Since $x^n$, $(x){\downarrow}_n$, and $(x){\uparrow}_n$ all have degree $n$, it follows
that $M$, $F$, and $R$ are indeed bases of $V$. The three indexed collections $M_{\le N}$, $F_{\le N}$, and
$R_{\le N}$ are all bases of the subspace $V_{\le N}$ consisting of polynomials in $x$ of degree at most $N$.

We can now recast the preceding theorems in the language of linear algebra. Recall that
if $B = (v_1,\ldots,v_n)$ and $C = (w_1,\ldots,w_n)$ are two ordered bases of a finite-dimensional
vector space $W$, the transition matrix from $B$ to $C$ is the unique $n \times n$ matrix $A = (a_{ij})$
such that

\[ v_j = \sum_{i=1}^{n} a_{ij} w_i \quad \text{for } 1 \le j \le n. \tag{2.6} \]

From linear algebra, we know that $A$ is invertible, and $A^{-1}$ is the transition matrix from $C$
to $B$.


2.64. Theorem: Transition Matrices between Polynomial Bases. Fix $N \ge 0$.

(a) The matrix $S = (S(n,k))_{0 \le n,k \le N}$ of Stirling numbers of the second kind is the transpose
of the transition matrix from the basis $M_{\le N}$ to the basis $F_{\le N}$.

(b) The matrix $s' = (s'(n,k))_{0 \le n,k \le N}$ of signless Stirling numbers of the first kind is the
transpose of the transition matrix from $R_{\le N}$ to $M_{\le N}$.

(c) The matrix $s = (s(n,k))_{0 \le n,k \le N}$ of signed Stirling numbers of the first kind is the
transpose of the transition matrix from $F_{\le N}$ to $M_{\le N}$.

(d) The $(N+1) \times (N+1)$ matrices $S$ and $s$ are inverses of one another.


Proof. The first three statements follow from Equations (2.3), (2.4), (2.5), and the definition
of transition matrices (2.6). The final statement is a special case of the fact that the
transition matrix from $B$ to $C$ is the inverse of the transition matrix from $C$ to $B$. □


Part (d) of the theorem says that we have matrix identities $Ss = I = sS$, where $I$ is the
$(N+1) \times (N+1)$ identity matrix. Writing what this means entry by entry, we obtain the
formulas

\[ \sum_{k} S(i,k)\,s(k,j) = \delta_{ij} = \sum_{k} s(i,k)\,S(k,j) \quad \text{for } i,j \ge 0, \]

where $\delta_{ij}$ is 1 if $i = j$ and 0 if $i \ne j$. A combinatorial proof of the second equality will be
given later (see Example 4.25).
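The matrix identities $Ss = I = sS$ are easy to confirm for small $N$. In the sketch below, the signed numbers are generated by the recursion $s(n,k) = s(n-1,k-1) - (n-1)s(n-1,k)$, which follows from the signless recursion together with $s(n,k) = (-1)^{n+k}s'(n,k)$; the helper names are illustrative.

```python
def stirling_matrices(N):
    """Return (S, s): the (N+1) x (N+1) matrices of S(n,k) and signed s(n,k)."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    s = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = s[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
            # Signed form of s'(n,k) = s'(n-1,k-1) + (n-1) s'(n-1,k).
            s[n][k] = s[n - 1][k - 1] - (n - 1) * s[n - 1][k]
    return S, s


def matmul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    size = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]
```

Row 3 of the signed matrix is $(0, 2, -3, 1)$, matching $x(x-1)(x-2) = x^3 - 3x^2 + 2x$.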


2.16 Solving Recursions with Constant Coefficients 


In Example 2.20, we promised to find an explicit closed formula for the Fibonacci numbers. 
This section addresses the more general problem of finding exact solutions to certain types 
of recursions. We focus initially on solving recursions of the form 


\[ x_n = b x_{n-1} + c x_{n-2} \quad \text{for } n \ge 2, \]

where $b$ and $c$ are real constants, and $(x_n : n \ge 0)$ is an unknown real sequence satisfying
the given recursion. We introduce the general method with a specific example.


2.65. Example. Let us seek solutions to the recursion $x_n = 2x_{n-1} + 8x_{n-2}$ (where $n \ge 2$).
The key initial idea is to try sequences of the form $x_n = r^n$, where $r$ is a fixed nonzero
real number. Substituting into the recursion, we see that $r$ must satisfy the equations
$r^n = 2r^{n-1} + 8r^{n-2}$ for all $n \ge 2$. Dividing by $r^{n-2}$, we see that these equations are equivalent
to the single condition $r^2 - 2r - 8 = 0$. Factoring this quadratic, we get $(r+2)(r-4) = 0$.
So $r = -2$ or $r = 4$, yielding solutions $x_n = (-2)^n$ and $x_n = 4^n$ for the given recursion.

The next key observation is that any linear combination of solutions to the recursion
is also a solution. More specifically, if sequences $(u_n : n \ge 0)$ and $(v_n : n \ge 0)$ solve the
recursion and $s, t$ are any scalars, then $x_n = su_n + tv_n$ also solves the recursion. To verify
this, compute

\begin{align*}
2x_{n-1} + 8x_{n-2} &= 2(su_{n-1} + tv_{n-1}) + 8(su_{n-2} + tv_{n-2}) \\
                    &= s(2u_{n-1} + 8u_{n-2}) + t(2v_{n-1} + 8v_{n-2}) \\
                    &= su_n + tv_n = x_n \quad \text{for } n \ge 2.
\end{align*}


We now have an infinite family of solutions to the original recursion, namely $x_n = s\cdot(-2)^n +
t\cdot 4^n$ for arbitrary scalars $s, t \in \mathbb{R}$. In fact, we will see momentarily that all real solutions
of the recursion have this form.

To pick out one specific solution from our infinite family of solutions, we need two initial
conditions. As an example, suppose we know $x_0 = 7$ and $x_1 = 10$. Using these values in the
formula $x_n = s\cdot(-2)^n + t\cdot 4^n$, we find that $7 = s + t$ and $10 = -2s + 4t$. This is a system of
two linear equations in the two unknowns $s$ and $t$. Solving this system, we find that $s = 3$
and $t = 4$, so $x_n = 3(-2)^n + 4\cdot 4^n$ is a solution to the given recursion satisfying the given
initial conditions.

More generally, suppose the initial conditions had been $x_0 = c_0$ and $x_1 = c_1$ for fixed
constants $c_0$ and $c_1$. Here we would need to solve the system of linear equations $s + t = c_0$
and $-2s + 4t = c_1$. Since the coefficient matrix of this system is invertible, there is always a
unique solution $(s,t)$; specifically, Cramer's Rule gives $s = (4c_0 - c_1)/6$ and $t = (2c_0 + c_1)/6$.
Thus the given recursion and initial conditions can be solved by exactly one sequence of the
form $x_n = s\cdot(-2)^n + t\cdot 4^n$.

Now we can explain why all solutions to the recursion must have this form. Suppose
$(z_n : n \ge 0)$ is any sequence satisfying the recursion. By the previous paragraph, there exist
scalars $s, t$ such that the sequence $x_n = s\cdot(-2)^n + t\cdot 4^n$ satisfies the recursion and initial
conditions $x_0 = z_0$ and $x_1 = z_1$. We can now prove by strong induction that $z_n = x_n$ for
all $n \ge 0$. This is true for $n = 0$ and $n = 1$ by the initial conditions. For fixed $n \ge 2$, we
may assume by induction that $z_m = x_m$ for $0 \le m < n$. Since both sequences satisfy the
recursion, we have $z_n = 2z_{n-1} + 8z_{n-2} = 2x_{n-1} + 8x_{n-2} = x_n$, completing the induction.
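The example can be replayed numerically. The sketch below (function names are illustrative) iterates the recursion $x_n = 2x_{n-1} + 8x_{n-2}$ and compares it with the closed form $s(-2)^n + t\,4^n$, where $s$ and $t$ come from the Cramer's Rule formulas above; exact fractions avoid any rounding.

```python
from fractions import Fraction


def by_recursion(c0, c1, count):
    """First `count` terms of x_n = 2*x_{n-1} + 8*x_{n-2} with x_0 = c0, x_1 = c1."""
    terms = [c0, c1]
    while len(terms) < count:
        terms.append(2 * terms[-1] + 8 * terms[-2])
    return terms[:count]


def by_closed_form(c0, c1, count):
    """Terms s*(-2)^n + t*4^n with s = (4c0 - c1)/6 and t = (2c0 + c1)/6."""
    s = Fraction(4 * c0 - c1, 6)
    t = Fraction(2 * c0 + c1, 6)
    return [s * (-2) ** n + t * 4 ** n for n in range(count)]
```

With $x_0 = 7$ and $x_1 = 10$, both functions produce 7, 10, 76, 232, 1072, and so on.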


The following result can be proved by generalizing the reasoning in the last example. 


2.66. Theorem: Solutions to $x_n = ax_{n-1} + bx_{n-2}$. Suppose $a, b$ are real constants such
that $r^2 - ar - b = 0$ has two distinct real roots $r_1 \ne r_2$. Given any $c_0, c_1 \in \mathbb{R}$, there exist
unique $s, t \in \mathbb{R}$ such that the recursion $x_n = ax_{n-1} + bx_{n-2}$ with initial conditions $x_0 = c_0$,
$x_1 = c_1$ is solved by $x_n = sr_1^n + tr_2^n$. Every real solution to the recursion has this form.


The polynomial $r^2 - ar - b$ is called the characteristic polynomial of the recursion
$x_n = ax_{n-1} + bx_{n-2}$.


2.67. Example: Fibonacci Recursion. We use the above method to solve $F_n =
F_{n-1} + F_{n-2}$ with initial conditions $F_0 = 0$ and $F_1 = 1$. (The sequence $f_n$ considered
in Example 2.20 is a shift of this one, namely $f_n = F_{n+2}$.) Here $a = b = 1$, so we must find
the roots of $r^2 - r - 1 = 0$. By the Quadratic Formula, the roots are $r_1 = (1+\sqrt{5})/2$ and
$r_2 = (1-\sqrt{5})/2$. Thus, the general solution to the recursion is $sr_1^n + tr_2^n$. To find $s$ and $t$, we
use the initial conditions. We obtain the system of equations $s + t = 0$, $r_1 s + r_2 t = 1$. Solving
leads to $s = 1/\sqrt{5}$ and $t = -1/\sqrt{5}$. Thus an exact formula for the Fibonacci numbers is

\[ F_n = \frac{(1+\sqrt{5})^n - (1-\sqrt{5})^n}{2^n\sqrt{5}} \quad \text{for all } n \ge 0. \]

Note that $r_1 \approx 1.618$ and $r_2 \approx -0.618$, so $\lim_{n\to\infty} r_2^n = 0$. Thus, the subtracted term in
the formula for $F_n$ becomes negligible for large $n$, giving the approximation $F_n \approx r_1^n/\sqrt{5}$.
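Both the exact formula and the approximation can be tested against the recursion. The sketch below is one possible check, with illustrative function names: it evaluates the closed formula exactly by storing $(1+\sqrt{5})^n$ as an integer pair $(p, q)$ meaning $p + q\sqrt{5}$, and evaluates the approximation $r_1^n/\sqrt{5}$ in floating point.

```python
def fibonacci(n):
    """F_n from the recursion F_n = F_{n-1} + F_{n-2}, F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def binet(n):
    """F_n from the closed formula, using exact arithmetic on p + q*sqrt(5)."""
    p, q = 1, 0  # (1 + sqrt5)^0 = 1 + 0*sqrt5
    for _ in range(n):
        # (p + q*sqrt5)(1 + sqrt5) = (p + 5q) + (p + q)*sqrt5
        p, q = p + 5 * q, p + q
    # (1 - sqrt5)^n = p - q*sqrt5, so the numerator equals 2*q*sqrt5.
    return (2 * q) // 2 ** n


def approximation(n):
    """Nearest integer to r_1^n / sqrt(5), computed in floating point."""
    return round(((1 + 5 ** 0.5) / 2) ** n / 5 ** 0.5)
```

The floating-point version is only reliable while rounding error stays small, so the sketch limits itself to moderate $n$; the exact version has no such restriction.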


Everything said so far works equally well for complex sequences. In particular, the roots 
of the characteristic polynomial may be complex even if the recursion we are solving is real. 
Our solution method still applies as long as the two roots are distinct. 


2.68. Example. Let us solve the recursion $x_n = 6x_{n-1} - 25x_{n-2}$ with initial conditions
$x_0 = 0$, $x_1 = 1$. The characteristic polynomial is $r^2 - 6r + 25$, which has complex roots
$r_1 = 3 + 4i$ and $r_2 = 3 - 4i$. The general (complex) solution of the recursion is $x_n =
s(3+4i)^n + t(3-4i)^n$ where $s, t \in \mathbb{C}$. Using the initial conditions, we must have $s + t = 0$
and $s(3+4i) + t(3-4i) = 1$. Solving this system of linear equations in $\mathbb{C}$, we get $s = -i/8$ and
$t = i/8$. Thus the particular solution to the given problem is $x_n = [-i(3+4i)^n + i(3-4i)^n]/8$.
Although this formula involves complex numbers, each $x_n$ is real (as can be seen by induction
on $n$, using the recursion and initial conditions).
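The claim that each $x_n$ is real can be observed numerically with Python's built-in complex type. This is a sketch with illustrative names; because complex powers are computed in floating point, the comparison uses a small tolerance rather than exact equality.

```python
def by_recursion(count):
    """First `count` terms of x_n = 6*x_{n-1} - 25*x_{n-2}, x_0 = 0, x_1 = 1."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(6 * terms[-1] - 25 * terms[-2])
    return terms[:count]


def by_closed_form(n):
    """The value [-i(3+4i)^n + i(3-4i)^n] / 8 as a Python complex number."""
    r1, r2 = complex(3, 4), complex(3, -4)
    return (-1j * r1 ** n + 1j * r2 ** n) / 8
```

For instance, $(3+4i)^2 = -7 + 24i$, so the closed form at $n = 2$ is $(24 + 24)/8 = 6$, in agreement with $x_2 = 6\cdot 1 - 25\cdot 0$.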


A special case occurs when the characteristic polynomial has a double root, i.e., $r^2 - ar - b$
factors as $(r - r_1)^2$. The next example shows what to do in this situation.


2.69. Example. Let us try to find all real solutions to the recursion $x_n = 6x_{n-1} - 9x_{n-2}$ for
$n \ge 2$. Guessing that $x_n = r^n$ for all $n$, we must have $r^n = 6r^{n-1} - 9r^{n-2}$ for all $n \ge 2$, which
leads (as before) to the characteristic equation $r^2 - 6r + 9 = 0$. In this case, $r_1 = 3$ is the only
root of this equation, giving $x_n = 3^n$ as one solution to the recursion. More generally, for any
real constant $s$, $x_n = s\cdot 3^n$ also solves the recursion, but we have not yet found all possible
solutions. A new trick is needed here: we guess a solution might have the form $x_n = nr^n$ for
some fixed $r$. Substituting into the recursion, we need $nr^n = 6(n-1)r^{n-1} - 9(n-2)r^{n-2}$ for
all $n \ge 2$. Dividing by $r^{n-2}$, we need $nr^2 = 6(n-1)r - 9(n-2)$ for all $n \ge 2$. Rearranging
this, we need $n(r^2 - 6r + 9) + 6(r - 3) = 0$ for all $n \ge 2$. This forces $r^2 - 6r + 9 = 0 = r - 3$,
which holds precisely when $r = 3$. Thus, $x_n = n3^n$ is a new solution to the recursion. Taking
linear combinations, we obtain the general solution $x_n = (s + tn)3^n$ for real constants $s, t$.
If we also have initial conditions $x_0 = c_0$ and $x_1 = c_1$, we can find $s$ and $t$ by solving the
linear system $c_0 = s$, $c_1 = 3(s + t)$. Evidently, the unique solution is $s = c_0$, $t = c_1/3 - c_0$.
By the same argument used in our original example, it follows from this that all solutions
of the recursion have the form $x_n = (s + tn)3^n$.
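The double-root formula can be checked the same way as the distinct-root case. The sketch below (illustrative names, exact fractions) compares the recursion $x_n = 6x_{n-1} - 9x_{n-2}$ with $(s + tn)3^n$ for $s = c_0$ and $t = c_1/3 - c_0$.

```python
from fractions import Fraction


def by_recursion(c0, c1, count):
    """First `count` terms of x_n = 6*x_{n-1} - 9*x_{n-2} with x_0 = c0, x_1 = c1."""
    terms = [c0, c1]
    while len(terms) < count:
        terms.append(6 * terms[-1] - 9 * terms[-2])
    return terms[:count]


def by_closed_form(c0, c1, count):
    """Terms (s + t*n) * 3^n with s = c0 and t = c1/3 - c0."""
    s = Fraction(c0)
    t = Fraction(c1, 3) - c0
    return [(s + t * n) * 3 ** n for n in range(count)]
```

For example, $x_0 = 2$ and $x_1 = 5$ give $s = 2$, $t = -1/3$, and $x_2 = (2 - 2/3)\cdot 9 = 12 = 6\cdot 5 - 9\cdot 2$.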


These ideas can be extended to solve recursions of the form $x_n = a_1x_{n-1} + a_2x_{n-2} +
\cdots + a_dx_{n-d}$ for $n \ge d$, where $d \ge 1$ is fixed and $a_1,\ldots,a_d$ are given constants not depending
on $n$. We only state the solution technique here, deferring the full proof until §11.5.


2.70. Method for Solving Recursions with Constant Coefficients. Let $a_1,\ldots,a_d$
be fixed real or complex numbers with $a_d \ne 0$. To solve the recursion

\[ x_n = a_1x_{n-1} + a_2x_{n-2} + \cdots + a_dx_{n-d} \quad \text{for } n \ge d, \]

first find all complex roots of the characteristic polynomial $r^d - a_1r^{d-1} - a_2r^{d-2} - \cdots - a_d$. If
$r$ is a root of this polynomial with multiplicity $m$, then each of the $m$ sequences $(r^n : n \ge 0)$,
$(nr^n : n \ge 0)$, $(n^2r^n : n \ge 0)$, ..., $(n^{m-1}r^n : n \ge 0)$ is a basic solution to the recursion. As
$r$ varies through all roots of the characteristic polynomial, we obtain $d$ basic solutions. The
general solution to the recursion is a linear combination (with complex scalars) of these $d$
basic solutions.

To find the particular solution satisfying d given initial conditions, regard the d coeffi- 
cients in the linear combination as unknowns, and substitute the general solution into the 
initial conditions to obtain a system of d linear equations in these unknowns. This sys- 
tem will have a unique solution giving the particular linear combination of basic solutions 
satisfying these initial conditions. 


2.71. Example. Let us find the general solution of the recursion

\[ x_n = 7x_{n-1} - 19x_{n-2} + 25x_{n-3} - 16x_{n-4} + 4x_{n-5} \quad \text{for } n \ge 5. \]

The characteristic polynomial $r^5 - 7r^4 + 19r^3 - 25r^2 + 16r - 4$ factors as $(r-1)^3(r-2)^2$. The
root $r = 1$ gives three basic solutions $x_n = 1^n = 1$, $x_n = n1^n = n$, and $x_n = n^21^n = n^2$.
The root $r = 2$ gives two more basic solutions $x_n = 2^n$ and $x_n = n2^n$. The general solution
is

\[ x_n = c_1 + c_2n + c_3n^2 + c_42^n + c_5n2^n \quad \text{for } n \ge 0. \]

To find the particular solution satisfying initial conditions $(x_0, x_1, x_2, x_3, x_4) = (0,0,1,3,1)$,
we take $n = 0,1,2,3,4$ in the general solution to obtain the system of linear equations

\begin{align*}
c_1 + c_4 &= 0, \\
c_1 + c_2 + c_3 + 2c_4 + 2c_5 &= 0, \\
c_1 + 2c_2 + 4c_3 + 4c_4 + 8c_5 &= 1, \\
c_1 + 3c_2 + 9c_3 + 8c_4 + 24c_5 &= 3, \\
c_1 + 4c_2 + 16c_3 + 16c_4 + 64c_5 &= 1.
\end{align*}

After some linear algebra, we find $c_1 = -15$, $c_2 = -8$, $c_3 = -2$, $c_4 = 15$, and $c_5 = -5/2$.
So the particular solution is

\[ x_n = -15 - 8n - 2n^2 + 15\cdot 2^n - (5/2)n2^n \quad \text{for } n \ge 0. \]


Once this solution is found, we can use strong induction to prove that it does indeed satisfy 
the given recursion. 
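The linear algebra in this example can be carried out mechanically. The following is a sketch, not the author's computation: it builds the $5 \times 5$ system from the five basic solutions, solves it exactly with a small Gauss-Jordan routine (a hypothetical helper named `solve`), and compares the resulting closed form with values produced by the recursion.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve an invertible square linear system exactly by Gauss-Jordan elimination."""
    size = len(matrix)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(size):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Basic solutions for the roots r = 1 (multiplicity 3) and r = 2 (multiplicity 2).
basis = [lambda n: 1, lambda n: n, lambda n: n ** 2,
         lambda n: 2 ** n, lambda n: n * 2 ** n]
initial = [0, 0, 1, 3, 1]
matrix = [[f(n) for f in basis] for n in range(5)]
coeffs = solve(matrix, initial)


def closed_form(n):
    """Value of c_1 + c_2 n + c_3 n^2 + c_4 2^n + c_5 n 2^n."""
    return sum(c * f(n) for c, f in zip(coeffs, basis))


def by_recursion(count):
    """First `count` terms of the order-5 recursion with the given initial values."""
    x = list(initial)
    while len(x) < count:
        x.append(7 * x[-1] - 19 * x[-2] + 25 * x[-3] - 16 * x[-4] + 4 * x[-5])
    return x[:count]
```

The next term after the initial values is $x_5 = 7\cdot 1 - 19\cdot 3 + 25\cdot 1 = -25$, and the closed form gives the same value at $n = 5$.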


Summary 


• Combinatorial Proofs. To prove a formula of the form $a = b$ combinatorially, define an
appropriate set $S$ of objects, give a counting argument showing that $|S| = a$, then give
a second counting argument showing that $|S| = b$.


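• Some Binomial Coefficient Identities.

a) Symmetry of Binomial Coefficients: $\binom{n}{k} = \binom{n}{n-k}$.

b) Pascal's Identity: $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$; in multinomial notation, $\binom{a+b}{a,b} = \binom{a+b-1}{a-1,b} + \binom{a+b-1}{a,b-1}$.

c) Sum of Binomial Coefficients: $\sum_{k=0}^{n} \binom{n}{k} = 2^n$.

d) Sum of Squared Binomial Coefficients: $\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}$.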

e) Sum of Diagonal of Pascal's Triangle: $\sum_{k=0}^{b} \binom{a+k}{k} = \binom{a+b+1}{b}$.

f) The Chu-Vandermonde Identity: $\sum_{k=0}^{c} \binom{a}{k}\binom{b}{c-k} = \binom{a+b}{c}$.


• Geometric Series. For real or complex $r \ne 1$, $\sum_{k=0}^{n} ar^k = a(1 - r^{n+1})/(1 - r)$.
When $|r| < 1$, $\sum_{k=0}^{\infty} ar^k = a/(1 - r)$.


• The Binomial Theorem. For all real $x, y$ and $n \in \mathbb{Z}_{\ge 0}$, $(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$.


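• Multinomial Theorems. When $z_1,\ldots,z_s$ commute,

\[ (z_1 + z_2 + \cdots + z_s)^n = \sum_{n_1+n_2+\cdots+n_s=n} \binom{n}{n_1,n_2,\ldots,n_s} z_1^{n_1} z_2^{n_2} \cdots z_s^{n_s}. \]

When $Z_1,\ldots,Z_s$ do not necessarily commute,

\[ (Z_1 + Z_2 + \cdots + Z_s)^n = \sum_{w \in \{1,\ldots,s\}^n} Z_{w_1} Z_{w_2} \cdots Z_{w_n}. \]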

• Sums of Powers of Integers. For all integers $n \ge 1$,

\[ \sum_{k=1}^{n} k = \frac{n(n+1)}{2}, \qquad \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}, \qquad \sum_{k=1}^{n} k^3 = \frac{n^2(n+1)^2}{4}. \]


• Recursions. A collection of combinatorial objects can often be described recursively,
by using smaller objects of the same kind to build larger objects. Induction arguments 
can be used to prove facts about recursively defined objects. The recursion and appro- 
priate initial conditions uniquely determine the quantities under consideration. If two 
collections of objects satisfy the same recursion and initial conditions, one can link to- 
gether two combinatorial proofs of the recursion to obtain recursively defined bijections 
between the two collections. 


• Fibonacci Recursion. The Fibonacci numbers satisfy $F_n = F_{n-1} + F_{n-2}$ for $n \ge 2$, with
initial conditions $F_0 = 0$ and $F_1 = 1$. (Other initial conditions are sometimes used.) For
$n \ge 0$, $F_{n+2}$ counts words in $\{0,1\}^n$ with no consecutive zeroes.


• Subset Recursion. Let $C(n,k)$ be the number of $k$-element subsets of an $n$-element set.
Then $C(n,k) = C(n-1,k) + C(n-1,k-1)$ for $0 < k < n$, with initial conditions
$C(n,0) = C(n,n) = 1$ for $n \ge 0$, and $C(n,k) = 0$ for $k < 0$ or $k > n$.


• Multiset Recursion. Let $M(n,k)$ be the number of $k$-element multisets using an $n$-letter
alphabet. Then $M(n,k) = M(n-1,k) + M(n,k-1)$ for $n, k > 0$, with initial conditions
$M(n,0) = 1$ for $n \ge 0$ and $M(0,k) = 0$ for $k > 0$.


• Anagram Recursion. Let $C(n; n_1,\ldots,n_s)$ be the number of rearrangements of $n_1$ copies
of one letter, $n_2$ copies of another letter, and so on. Then $C(n; n_1,\ldots,n_s) = \sum_{i=1}^{s} C(n-1; n_1,\ldots,n_i - 1,\ldots,n_s)$
for $n > 0$, with initial conditions $C(n; n_1,\ldots,n_s) = 0$ if any
$n_i < 0$, and $C(0; 0,\ldots,0) = 1$.


• Lattice Path Recursions. For $m \in \mathbb{Z}_{\ge 0}$, the $m$-ballot numbers $T_m(a,b)$ count lattice
paths from $(0,0)$ to $(a,b)$ that stay weakly above $y = mx$. These numbers satisfy
$T_m(a,b) = T_m(a-1,b) + T_m(a,b-1)$ for $b > ma > 0$, $T_m(a,ma) = T_m(a-1,ma)$ for
$a > 0$, and $T_m(0,b) = 1$ for $b \ge 0$. It follows that $T_m(a,b) = \frac{b-ma+1}{b+a+1}\binom{a+b+1}{a}$. Lattice
paths in rectangles and other regions satisfy similar recursions, but the initial conditions
change depending on the boundary of the region.


• Catalan Recursion. The Catalan numbers $C_n = \frac{1}{n+1}\binom{2n}{n}$ satisfy the recursion $C_n =
\sum_{k=1}^{n} C_{k-1}C_{n-k}$ for $n > 0$, with initial condition $C_0 = 1$. If a sequence $(d_n)$ satisfies
this recursion and initial condition, then $d_n = C_n$ for all $n$.


• Catalan Objects. Examples of objects counted by Catalan numbers include Dyck paths,
strings of balanced parentheses, binary trees, 231-avoiding permutations, and $\tau$-avoiding
permutations for any permutation $\tau$ of $\{1,2,3\}$.

• Integer Partitions. An integer partition of $n$ into $k$ parts is a weakly decreasing sequence
of $k$ positive integers whose sum is $n$. These partitions are counted by $p(n,k)$, which
satisfies the recursion $p(n,k) = p(n-1,k-1) + p(n-k,k)$ for $n, k > 0$, with initial
conditions $p(n,k) = 0$ for $k > n$ or $k < 0$, $p(n,0) = 0$ for $n > 0$, and $p(0,0) = 1$. Let
$p(n)$ be the number of integer partitions of $n$; then $p(n) = \sum_{k=0}^{n} p(n,k)$ and $p(n) =
\sum_{m=1}^{\infty} (-1)^{m-1}\bigl[p(n - m(3m-1)/2) + p(n - m(3m+1)/2)\bigr]$.


• Partition Diagrams. A partition $\mu = (\mu_1,\ldots,\mu_k)$ can be visualized as a diagram where
there are $\mu_i$ left-justified squares in the $i$th row from the top. The conjugate partition $\mu'$
is found by interchanging rows and columns in the diagram of $\mu$. By taking conjugates,
$p(n,k)$ is the number of integer partitions of $n$ with largest part $k$. The number of
partitions whose diagrams fit in an $a \times b$ box is $\binom{a+b}{a}$.


• Set Partitions. A set partition of a set $X$ is a set $P$ of nonempty pairwise disjoint
subsets of $X$ (called the blocks of $P$) whose union is $X$. The Stirling number of the
second kind, denoted $S(n,k)$, counts set partitions of an $n$-element set into $k$ blocks.
We have $S(n,k) = S(n-1,k-1) + kS(n-1,k)$ for $n, k > 0$, with initial conditions
$S(0,0) = 1$, $S(n,0) = 0$ for $n > 0$, and $S(0,k) = 0$ for $k > 0$. The Bell number $B(n)$
counts all set partitions of an $n$-element set. We have $B(n) = \sum_{k=0}^{n-1} \binom{n-1}{k} B(n-1-k)$,
with initial condition $B(0) = 1$.


• Surjection Rule. The number of surjective functions from an $n$-element set onto a
$k$-element set is $k!\,S(n,k)$.


• Balls in Nonempty Boxes. When all boxes must be nonempty, there are:
$b!\,S(a,b)$ ways to put $a$ labeled balls into $b$ labeled boxes;
$\binom{a-1}{b-1}$ ways to put $a$ unlabeled balls into $b$ labeled boxes;
$S(a,b)$ ways to put $a$ labeled balls into $b$ unlabeled boxes;
$p(a,b)$ ways to put $a$ unlabeled balls into $b$ unlabeled boxes.


• Equivalence Relations. A relation $R$ on a set $X$ is an equivalence relation iff $R$ is reflexive
($aRa$ for all $a \in X$), symmetric ($aRb$ implies $bRa$ for all $a, b \in X$), and transitive ($aRb$
and $bRc$ imply $aRc$ for all $a, b, c \in X$). The set of equivalence classes of an equivalence
relation on $X$ is a set partition of $X$; so there is a bijection from the set of equivalence
relations on $X$ to the set of set partitions of $X$. The Stirling number $S(n,k)$ is the
number of equivalence relations on an $n$-element set with $k$ equivalence classes. The
Bell number $B(n)$ is the number of equivalence relations on an $n$-element set.


• Rook Interpretations of Stirling Numbers. The Stirling number of the second kind $S(n,k)$
is the number of placements of $n-k$ non-attacking rooks on the triangular board
$\Delta_n = \mathrm{dg}(n-1,n-2,\ldots,3,2,1)$. The signless Stirling number of the first kind $s'(n,k)$
is the number of placements of $n-k$ non-attacking file rooks on $\Delta_n$ (i.e., no two rooks
appear in the same column). We have $s'(n,k) = s'(n-1,k-1) + (n-1)s'(n-1,k)$
for $0 < k < n$, with initial conditions $s'(n,0) = 0$ for $n > 0$ and $s'(n,n) = 1$ for $n \ge 0$.
Moreover, $s'(n,k) = \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n-1} i_1 i_2 \cdots i_{n-k}$.


• Stirling Number Identities. We have $x^n = \sum_{k=0}^{n} S(n,k)(x){\downarrow}_k$, $(x){\uparrow}_n = \sum_{k=0}^{n} s'(n,k)x^k$,
and $(x){\downarrow}_n = \sum_{k=0}^{n} s(n,k)x^k$, where $s(n,k) = (-1)^{n+k}s'(n,k)$ is the signed Stirling
number of the first kind. So Stirling numbers of both kinds appear as entries in transition
matrices between monomial bases, rising factorial bases, and falling factorial bases for
vector spaces of polynomials. The matrices with entries $S(n,k)$ and $s(n,k)$ are inverses
of each other.

• Solving Recursions with Constant Coefficients. To solve the recursion $x_n = a_1x_{n-1} +
a_2x_{n-2} + \cdots + a_dx_{n-d}$, find the roots of the characteristic polynomial $r^d - a_1r^{d-1} - \cdots - a_d$.
Each root $r$ of multiplicity $m$ yields $m$ basic solutions $x_n = n^ir^n$ for $0 \le i < m$. Every
solution of the recursion is a linear combination of the $d$ basic solutions. To find the
coefficients in the linear combination, use initial conditions for $x_0,\ldots,x_{d-1}$ to set up a
system of linear equations that can be solved for the coefficients.


Exercises 


2-1. Expand each of the following powers: (a) (2 + y)°; (b) (a —6)®; (c) (1+r)’. 

2-2. (a) Find the coefficient of x® in (32? +5)°. (b) Find the coefficient of x® in (273 + 9)°. 
2-3. Expand (4x — 3)® into a sum of monomials. 

2-4. Find the constant term in (2x — 2~1)®, 

2-5. Find the coefficient of a*b?cd* in (a+b+c¢+d)!*, where a,b,c, d are real variables. 
2-6. Find the coefficient of x? in (x? + # + 1). 


2-7. Suppose $z \in \mathbb{C}$ satisfies $z^7 = 1$ and $z \ne 1$. Find a cubic polynomial having $z + z^{-1}$ as
a root.


2-8. Given real variables 21,...,2s, write the expansions of (21 + z2 +---+ 2s)? and (21 + 
zg +--++ z5)° without using multinomial coefficients. 

2-9. (a) Given arbitrary $3 \times 3$ matrices $A, B, D$, expand $(A + B + D)^3$. (b) How does
the answer to (a) simplify if the three matrices commute? (c) How does the answer to (a) 
simplify if we only know that AB = BA? 

2-10. (a) When we use the Non-Commutative Multinomial Theorem to expand $(z_1 + z_2 +
\cdots + z_s)^n$, how many terms will there be?

(b) If all variables commute, how many terms will there be after collecting like terms? (For 
example, when s = n = 3 there are 10 terms.) 

2-11. Euler's Identity states that $e^{it} = \cos t + i\sin t$ for all real $t$, where $i \in \mathbb{C}$ satisfies
$i^2 = -1$. Use this and the Binomial Theorem to show that $\cos(3t) = 4\cos^3 t - 3\cos t$. Derive
a similar identity for $\sin(3t)$.

2-12. Give a combinatorial proof of the Multinomial Theorem similar to the proof of the 
Binomial Theorem given in the text. 

2-13. (a) Give an algebraic proof of this analogue of Pascal's Identity for multinomial
coefficients:

\[ \binom{n}{n_1,n_2,\ldots,n_s} = \binom{n-1}{n_1-1,n_2,\ldots,n_s} + \binom{n-1}{n_1,n_2-1,\ldots,n_s} + \cdots + \binom{n-1}{n_1,n_2,\ldots,n_s-1}. \]


(b) Give a combinatorial proof of this identity based on multidimensional lattice paths. 


2-14. Give an algebraic proof of the Multinomial Theorem by induction on the power n, 
using the identity in the previous exercise. 

2-15. Prove the Multinomial Theorem using the Binomial Theorem and induction on s. 
What identity relating binomial coefficients and multinomial coefficients do you need in 
your proof? 

2-16. Generalized Distributive Law. Suppose $R$ is a ring (possibly non-commutative),
$s$ is a positive integer, $J_1, J_2, \ldots, J_s$ are finite pairwise disjoint index sets, and for each
$j_k \in J_k$, $a_{j_k}$ is a member of $R$. Prove:

\[ \Bigl(\sum_{j_1 \in J_1} a_{j_1}\Bigr) \cdot \Bigl(\sum_{j_2 \in J_2} a_{j_2}\Bigr) \cdots \Bigl(\sum_{j_s \in J_s} a_{j_s}\Bigr) = \sum_{(j_1,\ldots,j_s) \in J_1 \times \cdots \times J_s} a_{j_1} a_{j_2} \cdots a_{j_s}. \]

Informally, this identity says that a product of sums expands into a sum of products of
terms, where we build each term by choosing one factor from each sum and multiplying the
chosen factors together. (The assumption that the index sets $J_k$ are pairwise disjoint can
always be achieved by replacing each $J_k$ by $\{k\} \times J_k$.)

7 5 : . ab b b b 
2-17. (a) Give an algebraic proof of the identity (a) = Le recene ree (A (,.) ee (,.)- 
(b) Give a combinatorial proof of this identity. 
2-18. (a) Find $1 + (2/5) + (2/5)^2 + \cdots + (2/5)^8$. (b) Find $1 - 1/3 + 1/9 - 1/27 + \cdots + 1/6561$.
2-19. Evaluate each infinite series. (a) 1 + (3/4) + (3/4)? + (3/4)? +--+. (b) 1— (4/7) + 
(4/7)? — (4/7)3 +--+. (c) L4+r34+r%4+1r94---, where |r| < 1. (d) 2, e*”, where x < 0. 
2-20. Express each infinite repeating decimal as a rational number. 
(a) 0.2222--- (b) 2.61616161--- (c) 0.123123123--- (d) 0.428571428571---. 
2-21. (a) Prove algebraically that $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$ for all $n > 0$. (b) Rewrite the identity
in (a) as $\sum_{k\ \mathrm{odd}} \binom{n}{k} = \sum_{k\ \mathrm{even}} \binom{n}{k}$ and give a bijective proof.
2-22. Prove $\sum_{k=0}^{n} \binom{n}{k} = 2^n$ using lattice paths.
2-23. Prove Cy) — Cc) using lattice paths. 


a, 


2-24. Given n € Zo, evaluate )lo<jchen (") Ce 


2-25. Given m,n € Zo, evaluate 74,4 fy 4.--th,, =n (Ki !ke! ++: Kall: 


2-26. Prove each identity by induction.

(a) $\sum_{k=1}^{n} k = n(n+1)/2$; (b) $\sum_{k=1}^{n} k^2 = n(n+1)(2n+1)/6$; (c) $\sum_{k=1}^{n} k^3 = n^2(n+1)^2/4$.


2-27. For fixed $n > 0$, let $R$ be the rectangle with vertices $(0,0)$, $(n,0)$, $(0,n)$, and $(n,n)$.
Let S be the set of rectangles contained in R with sides parallel to the sides of R and with 
vertices at integer coordinates. By counting S in two ways, give a combinatorial proof of 
the formula for the sum of the first n cubes. 

2-28. Use the technique of §2.6 to evaluate: (a) $\sum_{k=1}^{n} k^4$; (b) $\sum_{k=1}^{n} k^5$.

2-29. Use Pascal's Recursion to compute $\binom{9}{k}$ for $0 \le k \le 9$ and $\binom{10}{k}$ for $0 \le k \le 10$.

2-30. Compute the ballot numbers T(a,7) for 0 < a < 7 by drawing a picture. 

2-31. For fixed $k \in \mathbb{Z}_{>0}$, let $a_n$ be the number of $n$-letter words using the alphabet
$\{0,1,\ldots,k\}$ that do not contain 00. Find a recursion and initial conditions for $a_n$. Then
compute $a_5$.

2-32. How many words in {0,1,2}° do not contain three zeroes in a row? 

2-33. How many lattice paths from $(1,1)$ to $(6,6)$ always stay weakly between the lines
$y = 2x/5$ and $y = 5x/2$?

2-34. How many lattice paths go from (1,1) to (8,8) without ever passing through a point 
(p,q) such that p and q are both prime? 

2-35. Find the number of lattice paths from (0,0) to (6,6) that stay weakly between the
paths EENEENNEENNN and NNNENEENENEE and do not pass through (1,2) or (5,4).

2-36. Draw the diagrams of all integer partitions of n = 6 and n = 7. Indicate which
partitions are conjugates of one another.

2-37. Show that the Catalan number $C_n$ counts integer partitions $\mu$ such that $\mathrm{dg}(\mu) \subseteq \Delta_n$.
2-38. For each 7, list all permutations in SJ: (a) 7 = 123; (b) rT = 182; (c) 7 = 213; 
(d) r = 312; (e) 7 = 321. 

2-39. Compute p(8, 3) by direct enumeration and by using a recursion. 

2-40. Use Euler’s recursion to compute p(k) for 13 < k < 16 (see Figure 2.18). 

2-41. (a) List all the set partitions and rook placements counted by $S(5,2)$. (b) List all
the set partitions and equivalence relations counted by B(4). (c) Draw all the file rook 
placements counted by s’(4, 2). 

2-42. Compute S(9,k) for 0 <k <9 and S(10,k) for 0 < k < 10 (use Figure 2.20). 

2-43. Compute the Bell number B(k) for 9 < k < 12 (use Figure 2.20). 

2-44. Compute s(8,k) for 0 < k < 8 (use Figure 2.23). 

2-45. Prove the identity $k\binom{n}{k} = n\binom{n-1}{k-1} = (n-k+1)\binom{n}{k-1}$ algebraically and combinatorially
(where $1 \le k \le n$).

2-46. Prove the identity (”) (*) = ) =) algebraically and combinatorially. 


Ss 


2-47. Prove that for all integers n,k,i > 0, eee (cc) ) = ey Gi) Cas 


2-48. (a) Evaluate >p_, k(n +1—k). (b) Evaluate 77) Oy Dy l- 

2-49. Suppose X is an n-element set. Count the number of relations R on X satisfying 
each property: (a) no restrictions on R; (b) R is reflexive on X; (c) R is irreflexive; (d) R 
is symmetric; (e) R is irreflexive and symmetric; (f) R is antisymmetric. 

2-50. Let $X$ be a nine-element set and $Y$ a four-element set. (a) Find the probability that a
random function $f : X \to Y$ is surjective. (b) Find the probability that a random function
$g : Y \to X$ is injective.

2-51. Let $p'(n,k)$ be the number of integer partitions of $n$ with first part $k$. (a) Prove that
$p'(n,k) = p'(n-1,k-1) + p'(n-k,k)$ for $n, k > 0$. What are the initial conditions? (b) Use
(a) to prove $p'(n,k) = p(n,k)$ for all $n, k \ge 0$.

2-52. Let $\mathrm{Surj}(n,k)$ be the number of surjections from an $n$-element set onto a $k$-element set.
(a) Find a recursion and initial conditions for $\mathrm{Surj}(n,k)$. (b) Use (a) to prove $\mathrm{Surj}(n,k) =
k!\,S(n,k)$.

2-53. Verify equations (2.3), (2.4), and (2.5) by direct calculation for n = 3 and n = 4. 


2-54. (a) How many ways can we put $a$ labeled balls in $b$ unlabeled boxes? (b) What is the
answer if each box can contain at most one ball?


2-55. (a) How many ways can we put $a$ unlabeled balls in $b$ unlabeled boxes? (b) What is
the answer if each box can contain at most one ball?


2-56. (a) Find the rook placement associated to the set partition

\[ \{\{2,5\}, \{1,4,7,10\}, \{3\}, \{6,8\}, \{9\}\} \]

by the bijection in Remark 2.56. (b) Find the set partition associated to the following rook
placement:

[Figure: rook placement for part (b).]

2-57. Let $f : \{1,2,\ldots,7\} \to \{1,2,3\}$ be the surjection given by $f(1) = 3$, $f(2) = 3$,
$f(3) = 1$, $f(4) = 3$, $f(5) = 2$, $f(6) = 3$, $f(7) = 1$. In the proof of the Surjection Rule 2.47,
what choice sequence can be used to construct $f$?

2-58. How many compositions of 20 only use parts of sizes 1, 3, or 5? 

2-59. Use the recursion 2.22 for multisets to prove by induction that the number of
$k$-element multisets using an $n$-element alphabet is $\binom{k+n-1}{k}$.


2-60. Given a,b,c,n € Zso with a+ b+c = n, prove combinatorially that (aaa = 
DoalGe es eG 

2-61. Complete the proof of Theorem 2.27 by proving T,,(a, b) = bani aaa by induc- 
tion. 

2-62. (a) Show that $|S_n^{132}| = C_n$. (b) Show that $|S_n^{213}| = C_n$. (c) Show that $|S_n^{312}| = C_n$.

2-63. Convert the binary tree in Figure 2.11 to a: (a) 132-avoiding permutation; (b) 213-avoiding
permutation; (c) 312-avoiding permutation.

2-64. (a) Let $G_n$ be the set of lists of integers $(g_0, g_1, \ldots, g_{n-1})$ where $g_0 = 0$, each $g_i \ge 0$,
and $g_{i+1} \le g_i + 1$ for all $i < n-1$. Prove that $|G_n| = C_n$. (b) For $m \in \mathbb{Z}_{>0}$, let $G_n^{(m)}$ be
the set of lists of integers $(g_0, g_1, \ldots, g_{n-1})$ where $g_0 = 0$, each $g_i \ge 0$, and $g_{i+1} \le g_i + m$
for all $i < n-1$. Prove that $|G_n^{(m)}| = T_m(n, mn)$, the number of lattice paths from (0,0) to
$(n, mn)$ that never go below the line $y = mx$.

2-65. Consider the 231-avoiding permutation w = 1 5 2 4 3 11 7 6 10 8 9. Use recursive
bijections based on the Catalan recursion to map w to objects of the following kinds: (a) a
Dyck path; (b) a binary tree; (c) a 312-avoiding permutation; (d) an element of $G_n$ (see the
previous exercise).

2-66. Let $\pi$ be the Dyck path NNENEENNNENNENNEEENENEEE. Use recursive
bijections based on the Catalan recursion to map $\pi$ to objects of the following kinds: (a) a binary
tree; (b) a 231-avoiding permutation; (c) a 213-avoiding permutation.

2-67. Show that the number of possible rhyme schemes for an n-line poem using k different
rhyme syllables is the Stirling number S(n,k). (For example, ABABCDCDEFEFGG is a
rhyme scheme with n = 14 and k = 7.)

2-68. Find explicit formulas for S(n,k) when k is 1, 2, n — 1, and n. Prove your formulas 
using counting arguments. 

2-69. Give a combinatorial proof of the identity kS(n,k) = yi (Sin —j,k—1), where 
l<k<n. 

2-70. Prove Cy, = a T(k,n—k)? for n> 1. 

2-71. Consider lattice paths that can take unit steps north (N), south (S), west (W), or 
east (E), with self-intersections allowed. How many such paths begin and end at (0,0) and 
have 10 steps? 

2-72. Use the recursions 2.45 and 2.55 and the ideas in Remark 2.34 to give a recursive
definition of a bijection between rook placements counted by $S'(n,k)$ and set partitions
counted by $S(n,k)$. Is this bijection the same as the bijection described in Remark 2.56?

2-73. Give an algebraic proof of (2.4) by induction on n.

2-74. Fix $n \in \mathbb{Z}_{>0}$, let $\mu$ be an integer partition of length $\ell(\mu) \le n$, and set $\mu_k = 0$ for
$\ell(\mu) < k \le n$. Let $s'(\mu,k)$ be the number of placements of $n-k$ non-attacking file rooks on
the board $\mathrm{dg}(\mu')$. (a) Find a summation formula for $s'(\mu,k)$ analogous to 2.62. (b) Prove
that

$$(x + \mu_1)(x + \mu_2) \cdots (x + \mu_n) = \sum_{k=0}^{n} s'(\mu,k)\, x^k.$$

(c) For n = 7 and $\mu = (8,5,3,3,1)$, find $s'(\mu,k)$ for $0 \le k \le 7$.

2-75. Recall the Fibonacci numbers satisfy $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all
$n \ge 2$. (a) Show that $F_{n+1}$ is the number of compositions of n in which every part has size
1 or 2. (b) Show that $F_{n+2}$ is the number of subsets of {1,2,...,n} that do not contain two
consecutive integers.


2-76. (a) How many words in {0,1} do not contain consecutive zeroes and have exactly 
k ones? (b) Give a combinatorial proof that Fri = jez (";'). 


2-77. (a) Show that the sequence $a_n = F_{2n}$ satisfies the recursion $a_n = 3a_{n-1} - a_{n-2}$ for
$n \ge 2$. What are the initial conditions? (b) Show that $a_{n+1}$ is the number of words in
$\{A, B, C\}^n$ in which A is never immediately followed by B.

2-78. For n > 0, let a, be the number of words in {1,2,...,4}" in which 1 is never 
immediately followed by 2. Find and prove a recursion satisfied by the sequence $(a_n : n \ge 0)$.

2-79. Give algebraic or combinatorial proofs of the following formulas involving the
Fibonacci numbers $F_n$.

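(a) $\sum_{k=0}^{n} F_k = F_{n+2} - 1$

(b) $\sum_{k=0}^{n-1} F_{2k+1} = F_{2n}$

(c) $\sum_{k=0}^{n} F_{2k} = F_{2n+1} - 1$

(d) $\sum_{k=0}^{n} k F_k = n F_{n+2} - F_{n+3} + 2$

(e) $\sum_{k=0}^{n} F_k^2 = F_n F_{n+1}$

(f) $\sum_{k=0}^{n} \binom{n}{k} F_{k+p} = F_{2n+p}$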

2-80. Give a combinatorial proof of equation (2.3) by interpreting both sides as counting a 
certain collection of functions. 


2-81. Let C,,, be the number of Dyck paths of order n that end with exactly k east steps. 


Prove the recursion . 
(he Ler 
Ch = Ch— kr: 
i. > ( k— 1, r ) . 


ri 


2-82. Find the general solution to each recursion. 
(a) @y = 8an-1 — 15¢p_2 

(b) Ln = 3Xn~1 + 44n_-2 

(c) fn = TLp—-1 — 6%n~2 
(d) 


Ln = —5@n-1 = 62n_2 


(a) Ln = Atn—1 — Ln-—2 

(b) tp = (7Tan-1 — 22n~-2)/6 
(c) Up = —2Xn-1 — Ln—2 

(d) tp = 4¢n-1 — 29%y~-2 


a) In = —Xn-1 + 12t,=9; wo = 4, r= —l. 
b) Ln = ADn—1 = 4%n_2, wo = 2, t= 3. 
C) Ln = 34n-1 — Un—2, Lo = 5, 11 = 0. 

d) Ln = 82n_-1 = 14%, _2, wo = 0, t= 1. 


(a) @ = —2ay_-1 + Llap_2 + 12¢,_3 — 362y~4. 
(b) ty = Cn—-1 + O2n-2 + Fn 3 + 2n—4. 
(c) 
( 


Ln = 6%yn_-1 — 38%n_—2 — 142%yn_3 — O%n_4. 
d) fn = —40n_1 — 6%n—2 — 44-3 — En—4. 
2-86. Prove Theorem 2.66.

2-87. Prove that Method 2.70 yields all solutions to the given recursion in the case where
all roots of the characteristic polynomial are distinct. [Hint: You can use the fact that for
distinct complex numbers $r_1, \ldots, r_d$, the $d \times d$ matrix with $i,j$-entry $r_j^{d-i}$ has determinant
$\prod_{1 \le a < b \le d} (r_a - r_b) \ne 0$, hence this matrix is invertible. This is the Vandermonde Determinant
Formula, which we prove in §12.11.]

2-88. Let p be prime. Prove that $\binom{p}{k}$ is divisible by p for $0 < k < p$. Can you find a
combinatorial proof?

2-89. Fermat’s Little Theorem states that $a^p \equiv a \pmod{p}$ for $a \in \mathbb{Z}_{>0}$ and p prime. Prove
this by expanding $a^p = (1 + 1 + \cdots + 1)^p$ using the Multinomial Theorem (cf. the previous
exercise).

2-90. Ordered Set Partitions. An ordered set partition of a set X is a sequence
$P = (T_1, T_2, \ldots, T_k)$ of distinct sets such that $\{T_1, T_2, \ldots, T_k\}$ is a set partition of X. Let $B_o(n)$ be
the number of ordered set partitions of an n-element set. (a) Show $B_o(n) = \sum_{k=1}^{n} k!\,S(n,k)$
for $n \ge 1$. (b) Find a recursion relating $B_o(n)$ to values of $B_o(m)$ for $m < n$. (c) Compute
$B_o(n)$ for $0 \le n \le 5$.

2-91. (a) Let $B_1(n)$ be the number of set partitions of an n-element set such that no block
of the partition has size 1. Find a recursion and initial conditions for $B_1(n)$, and use these
to compute $B_1(n)$ for $1 \le n \le 6$. (b) Let $S_1(n,k)$ be the number of set partitions as in (a)
with k blocks. Find a recursion and initial conditions for $S_1(n,k)$.

2-92. Let $p_d(n,k)$ be the number of integer partitions of n with first part k and all parts distinct.
Find a recursion and initial conditions for $p_d(n,k)$.

2-93. Let $p_o(n,k)$ be the number of integer partitions of n with first part k and all parts odd.
Find a recursion and initial conditions for $p_o(n,k)$.

2-94. Let $q(n,k)$ be the number of integer partitions $\mu$ of length k and area n such that
$\mu' = \mu$ (such partitions are called self-conjugate). Find a recursion and initial conditions for
$q(n,k)$.

2-95. Verify the statement made after Definition 2.63 that any indexed collection of
polynomials $\{p_n(x) : n \ge 0\}$ such that $\deg(p_n) = n$ for all n is a basis for the real vector space
of polynomials in one variable with real coefficients.

2-96. Lah Numbers. A set partition with totally ordered blocks is a set partition
$\{B_1, B_2, \ldots, B_k\}$ together with a total ordering of the elements in each block $B_i$. For
$1 \le k \le n$, let $L(n,k)$ be the number of set partitions of [n] containing k totally ordered
blocks. Prove that $L(n,k) = \binom{n-1}{k-1} \frac{n!}{k!}$.

2-97. Define $L(n,k)$ as in the previous exercise. Prove: for all $n \ge 1$,

$$(x){\uparrow}_n = \sum_{k=1}^{n} (x){\downarrow}_k \, L(n,k).$$

2-98. Use the previous exercise to find a summation formula expressing Lah numbers in
terms of Stirling numbers of the first and second kind.

2-99. Complete the proof of Theorem 2.51 by verifying that: (a) $\phi(P) \in B$ for all $P \in A$;
(b) $\phi'(R) \in A$ for all $R \in B$; (c) $\phi \circ \phi' = \mathrm{id}_B$; (d) $\phi' \circ \phi = \mathrm{id}_A$.

2-100. Consider a product $x_1 \times x_2 \times \cdots \times x_n$, where the binary operation $\times$ is not necessarily
associative. Show that the number of ways to parenthesize this expression is the Catalan
number $C_{n-1}$. For example, the five possible parenthesizations when n = 4 are

$$(((x_1 \times x_2) \times x_3) \times x_4),\quad ((x_1 \times x_2) \times (x_3 \times x_4)),\quad (x_1 \times ((x_2 \times x_3) \times x_4)),$$

$$(x_1 \times (x_2 \times (x_3 \times x_4))),\quad ((x_1 \times (x_2 \times x_3)) \times x_4).$$

2-101. Let $f, g : \mathbb{R} \to \mathbb{R}$ have derivatives of all orders. Recall the Product Rule for
Derivatives: $D(fg) = D(f)g + fD(g)$, where D denotes differentiation with respect to x. (a) Prove
that the nth derivative of fg is given by

$$D^n(fg) = \sum_{k=0}^{n} \binom{n}{k} D^k(f)\, D^{n-k}(g).$$

(b) Find and prove a similar formula for $D^n(f_1 f_2 \cdots f_s)$, where $f_1, \ldots, f_s$ have derivatives
of all orders.


2-102. Prove: for all n > 0, 7y5 (a Cs) =A". 
2-103. Prove: for all n > 0, oyez (7, 

2-104. Prove: for all n > 1, Yy¢_, kj, 
2-105. Prove: for all $n \ge 0$, $\sum_{k=0}^{n} \binom{n}{k} F_k = F_{2n}$.
2-106. Prove: for alla > 1, peo (44/3 2" Fe 


Notes 


Gould’s book [51] contains an extensive, systematic list of binomial coefficient identities. 
More recently, Wilf and Zeilberger developed an algorithm, called the WZ-method, that 
can automatically evaluate many hypergeometric summations (which include binomial co- 
efficient identities) or prove that such a summation has no closed form. This method is 
described in [99]. For more information on hypergeometric series, see [75]. 

A wealth of information about integer partitions, including a discussion of the
Hardy-Rademacher-Ramanujan summation formula for p(n), may be found in [5]. There is a vast
literature on pattern-avoiding permutations; for more information on this topic, consult [14].

A great many combinatorial interpretations have been discovered for the Catalan
numbers $C_n$. A partial list appears in Exercise 6.19 of [121, Vol. 2]; this list continues in the
“Catalan Addendum,” available online at


http://www-math.mit.edu/~rstan/ec/catadd.pdf


Counting Problems in Graph Theory


Graph theory is a branch of discrete mathematics that studies networks composed of a 
number of sites (vertices) linked together by connecting arcs (edges). This chapter studies 
some enumeration problems that arise in graph theory. We begin by defining fundamental 
graph-theoretic concepts such as walks, paths, cycles, vertex degrees, connectivity, forests, 
and trees. This leads to a discussion of various counting problems involving different kinds 
of trees. Aided by ideas from matrix theory, we count walks in a graph, spanning trees of 
a graph, and Eulerian tours. We also investigate the chromatic polynomial of a graph; this 
polynomial counts the number of ways of coloring the vertices such that no two vertices 
joined by an edge receive the same color. 




3.1 Graphs and Digraphs 


Intuitively, a graph is a mathematical model for a network consisting of a collection of 
nodes and connections that link certain pairs of nodes. For example, the nodes could be 
cities and the connections could be roads between cities. The nodes could be computers and 
the connections could be network links between computers. The nodes could be species in an 
ecosystem and the connections could be predator-prey relationships between species. The 
nodes could be tasks and the connections could be dependencies among the tasks. There 
are many such applications that lead naturally to graph models. We now give the formal 
mathematical definitions underlying such models. 


3.1. Definition: Graphs. A graph is an ordered triple $G = (V, E, \varepsilon)$, where: V = V(G) is
a finite, nonempty set called the vertex set of G; E = E(G) is a finite set called the edge set
of G; and $\varepsilon : E \to P(V)$ is a function called the endpoint function such that, for all $e \in E$,
$\varepsilon(e)$ is either a one-element subset of V or a two-element subset of V. If $\varepsilon(e) = \{v\}$, we call
the edge e a loop at vertex v. If $\varepsilon(e) = \{v, w\}$, we call v and w the endpoints of e and say
that e is an edge from v to w. We also say that v and w are adjacent in G, v and w are
joined by e, and e is incident to v and w.


We visualize a graph $G = (V, E, \varepsilon)$ by drawing a collection of dots labeled by the elements
$v \in V$. For each edge $e \in E$ with $\varepsilon(e) = \{v, w\}$, we draw a line or curved arc labeled e
between the two dots labeled v and w. Similarly, if $\varepsilon(e) = \{v\}$, we draw a loop labeled e
based at the dot labeled v.


3.2. Example. The left drawing in Figure 3.1 represents the graph defined formally by the
ordered triple

$$G_1 = (\{1, 2, 3, 4, 5\}, \{a, b, c, d, e, f, g, h, i\}, \varepsilon),$$

where $\varepsilon$ acts as follows:

$$\varepsilon(a) = \{1,4\},\quad \varepsilon(b) = \{3,4\},\quad \varepsilon(c) = \{2,3\},\quad \varepsilon(d) = \{1,2\},\quad \varepsilon(e) = \{1,2\},$$

$$\varepsilon(f) = \{3\},\quad \varepsilon(g) = \{2,5\},\quad \varepsilon(h) = \{4,5\},\quad \varepsilon(i) = \{4,5\}.$$


FIGURE 3.1
A graph, a simple graph, and a digraph. 


Edge f is a loop at vertex 3; edges h and i both go between vertices 4 and 5; vertices 1 and
4 are adjacent, but vertices 2 and 4 are not.


In many applications, there are no loop edges, and there is never more than one edge
between the same two vertices. This means that the endpoint function $\varepsilon$ is a one-to-one
map into the set of two-element subsets of V. So we can identify each edge e with its set of
endpoints $\varepsilon(e)$. This leads to the following simplified model in which edges are not explicitly
named and there is no explicit endpoint function.


3.3. Definition: Simple Graphs. A simple graph is a pair G = (V,E), where V is a 
finite nonempty set and E is a set of two-element subsets of V. We continue to use all the 
terminology introduced in Definition 3.1. 


3.4. Example. The center drawing in Figure 3.1 depicts the simple graph $G_2$ with vertex
set $V(G_2) = \{0,1,2,3,4,5,6\}$ and edge set

$$E(G_2) = \{\{1,2\}, \{2,3\}, \{3,4\}, \{4,5\}, \{5,6\}, \{1,6\}, \{2,6\}, \{1,4\}\}.$$


To model certain situations (such as predator-prey relationships, or one-way streets in 
a city), we need to introduce a direction on each edge. This leads to the notion of a digraph 
(directed graph). 


3.5. Definition: Digraphs. A digraph is an ordered triple $D = (V, E, \varepsilon)$, where V is a
finite nonempty set of vertices, E is a finite set of edges, and $\varepsilon : E \to V \times V$ is the endpoint
function. If $\varepsilon(e)$ is the ordered pair (v, w), we say that e is an edge from v to w.


In a digraph, an edge from v to w is not an edge from w to v when $v \ne w$, since
$(v,w) \ne (w,v)$. On the other hand, in a graph, an edge from v to w is also an edge from w
to v, since {v, w} = {w, v}.


3.6. Example. The right drawing in Figure 3.1 displays a digraph $G_3$. In this digraph,
$\varepsilon(j) = (4,5)$, $\varepsilon(a) = (1,1)$, and so on. There are three edges from 2 to 3, but no edges from
3 to 2. There are edges in both directions between vertices 1 and 5.


As before, we can eliminate specific reference to the endpoint function of a digraph if
there are no multiple edges with the same starting vertex and ending vertex.


3.7. Definition: Simple Digraphs. A simple digraph is an ordered pair D = (V, E),
where V is a finite, nonempty set and E is a subset of $V \times V$. Each ordered pair $(v,w) \in E$
represents an edge in D from v to w. Note that we do allow loops (v = w) in a simple
digraph.


When investigating structural properties of graphs, the names of the vertices and edges 
are often irrelevant. The concept of graph isomorphism lets us identify graphs that are the 
same except for the names used for the vertices and edges. 


3.8. Definition: Graph Isomorphism. Given two graphs $G = (V, E, \varepsilon)$ and
$H = (W, F, \eta)$, a graph isomorphism from G to H consists of two bijections $f : V \to W$ and
$g : E \to F$ such that, for all $e \in E$, if $\varepsilon(e) = \{v,w\}$ then $\eta(g(e)) = \{f(v), f(w)\}$ (we
allow v = w here). Digraph isomorphisms are defined similarly: $\varepsilon(e) = (v,w)$ implies
$\eta(g(e)) = (f(v), f(w))$. We say G and H are isomorphic, written $G \cong H$, iff there exists a
graph isomorphism from G to H.


In the case of simple graphs G = (V, E) and H = (W, F), a graph isomorphism can be
viewed as a bijection $f : V \to W$ that induces a bijection between the edge sets E and F.
More specifically, this means that for all $v, w \in V$, $\{v,w\} \in E$ iff $\{f(v), f(w)\} \in F$.


3.9. Example. Let $G = (V,E) = (\{1,2,3,4\}, \{\{1,2\}, \{1,3\}, \{1,4\}\})$ and
$H = (W,F) = (\{a,b,c,d\}, \{\{a,c\}, \{b,c\}, \{c,d\}\})$. The map $f : V \to W$ such that f(1) = c, f(2) = a,
f(3) = b, and f(4) = d is a graph isomorphism, as is readily verified, so $G \cong H$. In fact,
for any bijection $g : V \to W$, g is a graph isomorphism iff g(1) = c. By the Product Rule,
there are 3! = 6 graph isomorphisms from G to H. 
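
The count in Example 3.9 can be confirmed by brute force. The following Python sketch is only an illustration: the function name `count_isomorphisms` is ours, and it simply tries every bijection from V to W and keeps those that carry the edge set E exactly onto F, which for simple graphs is the condition stated after Definition 3.8.

```python
from itertools import permutations

def count_isomorphisms(V, E, W, F):
    """Count bijections f: V -> W with {f(v), f(w)} in F iff {v, w} in E (simple graphs)."""
    E = {frozenset(e) for e in E}
    F = {frozenset(e) for e in F}
    V = list(V)
    count = 0
    for image in permutations(W):
        f = dict(zip(V, image))                      # one candidate bijection
        mapped = {frozenset(f[v] for v in e) for e in E}
        if mapped == F:                              # f induces a bijection E -> F
            count += 1
    return count

# Data of Example 3.9.
V = [1, 2, 3, 4]
E = [{1, 2}, {1, 3}, {1, 4}]
W = ["a", "b", "c", "d"]
F = [{"a", "c"}, {"b", "c"}, {"c", "d"}]
print(count_isomorphisms(V, E, W, F))  # 6
```

Trying all |V|! bijections is feasible only for very small graphs; it is meant as a check of the example, not as a practical isomorphism test.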




3.2 Walks and Matrices


We can travel through a graph by following several edges in succession. Formalizing this 
idea leads to the concept of a walk. 


3.10. Definition: Walks, Paths, Cycles. Let $G = (V, E, \varepsilon)$ be a graph or digraph. A
walk in G is a sequence

$$W = (v_0, e_1, v_1, e_2, v_2, \ldots, v_{s-1}, e_s, v_s)$$

where $s \ge 0$, $v_i \in V$ for all i, $e_i \in E$ for all i, and $e_i$ is an edge from $v_{i-1}$ to $v_i$ for $1 \le i \le s$.
We say that W is a walk of length s from $v_0$ to $v_s$. The walk W is closed iff $v_0 = v_s$. The
walk W is a path iff the vertices $v_0, v_1, \ldots, v_s$ are pairwise distinct (which forces the edges
$e_i$ to be distinct as well). The walk W is a cycle iff $s > 0$, $v_1, \ldots, v_s$ are distinct, $e_1, \ldots, e_s$
are distinct, and $v_0 = v_s$. A k-cycle is a cycle of length k. In the case of simple graphs and
simple digraphs, the edges $e_i$ are determined uniquely by their endpoints. So, in the simple
case, we can regard a walk as a sequence of vertices $(v_0, v_1, \ldots, v_s)$ such that there is an
edge from $v_{i-1}$ to $v_i$ in G for $1 \le i \le s$.


3.11. Remark. When considering cycles in a digraph, we often identify two cycles that are 
cyclic shifts of one another (unless we need to keep track of the starting vertex of the cycle). 
Similarly, we identify cycles in a graph that are cyclic shifts or reversals of one another. 


FIGURE 3.2
Digraph used to illustrate adjacency matrices.


3.12. Example. In the graph $G_1$ from Figure 3.1,

$$W_1 = (2, c, 3, f, 3, b, 4, h, 5, i, 4, i, 5)$$

is a walk of length 6 from vertex 2 to vertex 5. In the simple graph $G_2$ in the same figure,
$W_2 = (1,6,2,3,4,5)$ is a walk and a path of length 5, whereas $C = (6,5,4,3,2,6)$ is a
5-cycle. We often identify C with the cycles (5,4,3,2,6,5), (6,2,3,4,5,6), etc. In the digraph
$G_3$,

$$W_3 = (1, a, 1, g, 5, h, 2, m, 4, j, 5, h, 2, d, 3)$$

is a walk from vertex 1 to vertex 3; $(5, h, 2, m, 4, j, 5)$ is a 3-cycle; $(4, l, 4)$ is a 1-cycle; and
$(5, f, 1, g, 5)$ is a 2-cycle. Observe that 1-cycles are the same as loop edges, and 2-cycles
cannot exist in simple graphs. For any vertex v in a graph or digraph, (v) is a walk of 
length zero from v to v, which is a path but not a cycle. 
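
For simple graphs, the conditions of Definition 3.10 are easy to test mechanically. The Python sketch below is an illustration under that assumption (vertex sequences only, with each edge identified with its two-element set of endpoints); the function name `classify` and the dictionary keys are ours. It is applied to the walks $W_2$ and C in the simple graph $G_2$ of Example 3.4.

```python
def classify(vertices, edges):
    """Classify a vertex sequence (v_0, ..., v_s) in a simple graph per Definition 3.10."""
    edge_set = {frozenset(e) for e in edges}
    # In a simple graph, the edge used at each step is determined by its endpoints.
    steps = [frozenset(pair) for pair in zip(vertices, vertices[1:])]
    s = len(vertices) - 1
    is_walk = all(step in edge_set for step in steps)
    closed = is_walk and vertices[0] == vertices[-1]
    is_path = is_walk and len(set(vertices)) == len(vertices)
    # Cycle: s > 0, v_1..v_s distinct, e_1..e_s distinct, and v_0 = v_s.
    is_cycle = (closed and s > 0
                and len(set(vertices[1:])) == s
                and len(set(steps)) == s)
    return {"walk": is_walk, "length": s, "closed": closed,
            "path": is_path, "cycle": is_cycle}

# Edge set of the simple graph G_2 from Example 3.4.
G2_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (1, 6), (2, 6), (1, 4)]

print(classify([1, 6, 2, 3, 4, 5], G2_edges))   # W_2: a path of length 5
print(classify([6, 5, 4, 3, 2, 6], G2_edges))   # C: a 5-cycle
print(classify([1, 2, 1], G2_edges))            # closed walk, but not a cycle
print(classify([3], G2_edges))                  # length zero: a path, not a cycle
```

The third call illustrates why 2-cycles cannot exist in a simple graph: the closed walk (1, 2, 1) must reuse the single edge {1, 2}.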


We can now formulate our first counting problem: how many walks of a given length 
are there between two given vertices in a graph or digraph? We will develop an algebraic 
solution to this problem in which concatenation of walks is modeled by multiplication of 
certain matrices. 


3.13. Definition: Adjacency Matrix. Let G be a graph or digraph with vertex set
$X = \{x_i : 1 \le i \le n\}$. The adjacency matrix of G (relative to the given indexing of the
vertices) is the $n \times n$ matrix A whose $i,j$-entry $A(i,j)$ is the number of edges in G from $x_i$
to $x_j$.


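3.14. Example. The adjacency matrix for the digraph G in Figure 3.2 is

$$A = \begin{bmatrix}
1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 3 & 0 & 1 & 0 \\
0 & 1 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 2
\end{bmatrix}.$$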

FIGURE 3.3
A simple graph.


3.15. Example. If G is a graph, edges from v to w are the same as edges from w to v.
So, the adjacency matrix for G is a symmetric matrix ($A(i,j) = A(j,i)$ for all $i, j$). If G
is a simple graph, the adjacency matrix consists of all 1’s and 0’s with zeroes on the main
diagonal. For example, the adjacency matrix of the simple graph in Figure 3.3 is

$$\begin{bmatrix}
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0
\end{bmatrix}.$$


Recall from linear algebra the definition of the product of two matrices. 


3.16. Definition: Matrix Multiplication. Suppose A is an $m \times n$ matrix and B is an
$n \times p$ matrix. Then AB is the $m \times p$ matrix whose $i,j$-entry is

$$(AB)(i,j) = \sum_{k=1}^{n} A(i,k)B(k,j) \quad\text{for } 1 \le i \le m,\ 1 \le j \le p. \tag{3.1}$$


Matrix multiplication is associative, so we can write a product of three or more (compat- 
ible) matrices without any parentheses. The next theorem gives a formula for the general 
entry in such a product. 


3.17. Theorem: Product of Several Matrices. Assume $A_1, \ldots, A_s$ are matrices such
that $A_i$ has dimensions $n_{i-1} \times n_i$. Then $A_1 A_2 \cdots A_s$ is the $n_0 \times n_s$ matrix whose $k_0, k_s$-entry
is

$$(A_1 A_2 \cdots A_s)(k_0, k_s) = \sum_{k_1=1}^{n_1} \sum_{k_2=1}^{n_2} \cdots \sum_{k_{s-1}=1}^{n_{s-1}} A_1(k_0,k_1) A_2(k_1,k_2) A_3(k_2,k_3) \cdots A_s(k_{s-1},k_s) \tag{3.2}$$

for all $k_0$ and $k_s$ such that $1 \le k_0 \le n_0$ and $1 \le k_s \le n_s$.


Proof. We use induction on s. The case s = 1 is immediate, and the case s = 2 is the
definition of matrix multiplication (after a change in notation). Assume s > 2 and that
(3.2) is known to hold for the product $B = A_1 A_2 \cdots A_{s-1}$. We can think of the given
product $A_1 A_2 \cdots A_{s-1} A_s$ as the binary product $B A_s$. Therefore, using (3.1), the $k_0, k_s$-entry
of $A_1 A_2 \cdots A_s$ is

$$\begin{aligned}
(BA_s)(k_0,k_s) &= \sum_{k=1}^{n_{s-1}} B(k_0,k) A_s(k,k_s) \\
&= \sum_{k=1}^{n_{s-1}} \left( \sum_{k_1=1}^{n_1} \cdots \sum_{k_{s-2}=1}^{n_{s-2}} A_1(k_0,k_1) A_2(k_1,k_2) \cdots A_{s-1}(k_{s-2},k) \right) A_s(k,k_s) \\
&= \sum_{k_1=1}^{n_1} \cdots \sum_{k_{s-2}=1}^{n_{s-2}} \sum_{k=1}^{n_{s-1}} A_1(k_0,k_1) A_2(k_1,k_2) \cdots A_{s-1}(k_{s-2},k) A_s(k,k_s).
\end{aligned}$$

Replacing k by $k_{s-1}$ in the innermost summation, we obtain the result. □


Taking all $A_i$’s in the theorem to be the same matrix A, we obtain the following formula
for the entries of the powers of a given square matrix.


3.18. Corollary: Powers of a Matrix. Suppose A is an $n \times n$ matrix. For each integer
$s > 0$ and all $i, j$ in the range $1 \le i, j \le n$,

$$A^s(i,j) = \sum_{k_1=1}^{n} \cdots \sum_{k_{s-1}=1}^{n} A(i,k_1) A(k_1,k_2) \cdots A(k_{s-2},k_{s-1}) A(k_{s-1},j). \tag{3.3}$$


The preceding formula may appear unwieldy, but it is precisely the tool we need to count 
walks in graphs. 


3.19. The Walk Rule. Let G be a graph or digraph with vertex set $X = \{x_1, \ldots, x_n\}$,
and let A be the adjacency matrix of G. For all $i, j$ between 1 and n and all $s \ge 0$, the
$i,j$-entry of $A^s$ is the number of walks of length s in G from $x_i$ to $x_j$.


Proof. The result holds for s = 0, since $A^0 = I_n$ (the $n \times n$ identity matrix) and there is
exactly one walk of length zero from any vertex to itself. Now suppose s > 0. A walk of
length s from $x_i$ to $x_j$ will visit s − 1 intermediate vertices (not necessarily distinct from
each other or from $x_i$ or $x_j$). Let $(x_i, x_{k_1}, \ldots, x_{k_{s-1}}, x_j)$ be the ordered list of vertices visited
by the walk. To build such a walk, we choose any edge from $x_i$ to $x_{k_1}$ in $A(i,k_1)$ ways; then
we choose any edge from $x_{k_1}$ to $x_{k_2}$ in $A(k_1,k_2)$ ways; and so on. By the Product Rule, the
total number of walks associated to this vertex sequence is $A(i,k_1) A(k_1,k_2) \cdots A(k_{s-1},j)$.
This formula holds even if there are no walks with this vertex sequence, since some term
in the product will be zero in this case. Applying the Sum Rule produces the right side of
(3.3), and the result follows. □


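3.20. Example. Consider again the adjacency matrix A of the digraph G in Figure 3.2.
Some matrix computations show that

$$A^2 = \begin{bmatrix}
1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 3 & 2 & 1 & 0 \\
0 & 1 & 0 & 2 & 0 & 0 \\
0 & 1 & 0 & 2 & 3 & 0 \\
0 & 0 & 6 & 1 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 4
\end{bmatrix}, \qquad
A^3 = \begin{bmatrix}
1 & 1 & 1 & 2 & 1 & 0 \\
0 & 1 & 6 & 3 & 6 & 0 \\
0 & 0 & 6 & 1 & 3 & 0 \\
0 & 3 & 6 & 7 & 3 & 0 \\
0 & 3 & 3 & 6 & 7 & 0 \\
0 & 0 & 0 & 0 & 0 & 8
\end{bmatrix}.$$

So, for example, there are six walks of length 2 from vertex 5 to vertex 3, and there are
seven walks of length 3 that start and end at vertex 4. By looking at $A^9 = (A^3)^3$, we find
that there are 1074 walks of length 9 from vertex 2 to vertex 4, but only 513 walks of length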
9 from vertex 4 to vertex 2. 
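
The numbers quoted in Example 3.20 can be reproduced with a few lines of Python. This is a minimal sketch using plain nested lists; the helper names `mat_mult`, `mat_power`, and `count_walks_brute` are ours. The matrix is the adjacency matrix of Example 3.14, with vertices 1 through 6 stored at indices 0 through 5. The brute-force counter follows the proof of the Walk Rule directly (choose the first edge, then recurse) and serves as an independent check on the matrix power.

```python
def mat_mult(A, B):
    """Matrix product as in Definition 3.16."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_power(A, s):
    """Return A^s for an integer s >= 0 (A^0 is the identity matrix)."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = mat_mult(result, A)
    return result

def count_walks_brute(A, i, j, s):
    """Count walks of length s from index i to index j by choosing the first edge."""
    if s == 0:
        return int(i == j)
    return sum(A[i][k] * count_walks_brute(A, k, j, s - 1)
               for k in range(len(A)) if A[i][k])

A = [[1, 0, 1, 0, 0, 0],
     [0, 0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1, 0],
     [0, 0, 3, 0, 1, 0],
     [0, 1, 0, 2, 0, 0],
     [0, 0, 0, 0, 0, 2]]

A2, A3, A9 = mat_power(A, 2), mat_power(A, 3), mat_power(A, 9)
print(A2[4][2], A3[3][3], A9[1][3], A9[3][1])   # 6 7 1074 513
print(count_walks_brute(A, 1, 3, 9))            # 1074
```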




3.3 Directed Acyclic Graphs and Nilpotent Matrices


Next we consider the question of counting all walks (of any length) between two vertices 
in a digraph. The question is uninteresting for graphs, since the number of walks between 
two distinct vertices v,w in a graph is either zero or infinity. This follows since a walk is 
allowed to repeatedly traverse a particular edge along a path from v to w, which leads to
arbitrarily long walks from v to w. Similarly, if G is a digraph that contains a cycle, we
obtain arbitrarily long walks between two vertices on the cycle by going around the cycle 
again and again. To rule out these possibilities, we restrict attention to the following class 
of digraphs. 


3.21. Definition: DAGs. A DAG is a digraph with no cycles. (The acronym DAG stands
for “directed acyclic graph.”)


To characterize adjacency matrices of DAGs, we need another concept from matrix 
theory. 


3.22. Definition: Nilpotent Matrices. An $n \times n$ matrix A is called nilpotent iff $A^s = 0$
for some integer $s \ge 1$. The least such integer s is called the index of nilpotence of A.


Note that if $A^s = 0$ and $t \ge s$, then $A^t = 0$ also.


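3.23. Example. The zero matrix is the unique $n \times n$ matrix with index of nilpotence 1. The
square of the nonzero matrix $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ is zero, so A is nilpotent of index 2. Similarly,
given any real x, y, z, the matrix

$$B = \begin{bmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{bmatrix}$$

satisfies

$$B^2 = \begin{bmatrix} 0 & 0 & xz \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad B^3 = 0,$$

so, taking $xz \ne 0$, we obtain examples of matrices that are nilpotent of index 3. The next result generalizes
this example.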


3.24. Theorem: Nilpotence of Strictly Triangular Matrices. Suppose A is an $n \times n$
strictly upper-triangular matrix, which means $A(i,j) = 0$ for all $i \ge j$. Then A is nilpotent
of index at most n. A similar result holds for strictly lower-triangular matrices.


Proof. It suffices to show that $A^n$ is the zero matrix. Fix $k_0$ and $k_n$ between 1 and n. By
(3.3), we have

$$A^n(k_0,k_n) = \sum_{k_1=1}^{n} \cdots \sum_{k_{n-1}=1}^{n} A(k_0,k_1) A(k_1,k_2) \cdots A(k_{n-1},k_n).$$

We claim that each term in this sum is zero. Otherwise, there would exist $k_1, \ldots, k_{n-1}$ such
that $A(k_{t-1},k_t) \ne 0$ for each t between 1 and n. But since A is strictly upper-triangular,
we would then have

$$k_0 < k_1 < k_2 < \cdots < k_n.$$

This cannot occur, since every $k_t$ is an integer between 1 and n, and that range does not
contain n + 1 distinct integers. □

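
Theorem 3.24 and Example 3.23 are easy to test numerically. The sketch below is only an illustration (the names `mat_mult` and `nilpotence_index` are ours): it multiplies by A repeatedly and reports the first power, up to the n-th, that vanishes. By Theorem 3.24 this search always succeeds for a strictly upper-triangular matrix.

```python
def mat_mult(A, B):
    """Matrix product as in Definition 3.16."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def nilpotence_index(A):
    """Return the least s with 1 <= s <= n and A^s = 0, or None if no such power vanishes."""
    n = len(A)
    power = A                                   # invariant: power == A^s at the top of the loop
    for s in range(1, n + 1):
        if all(x == 0 for row in power for x in row):
            return s
        power = mat_mult(power, A)
    return None

print(nilpotence_index([[0, 1], [0, 0]]))                     # 2
print(nilpotence_index([[0, 2, 5], [0, 0, 3], [0, 0, 0]]))    # 3  (x, y, z = 2, 5, 3)
print(nilpotence_index([[0, 2, 5], [0, 0, 0], [0, 0, 0]]))    # 2  (here xz = 0)
print(nilpotence_index([[1, 0], [0, 0]]))                     # None
```

The third call shows that the bound in the theorem need not be attained: the index of nilpotence is at most n, but it can be smaller.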
The next theorem reveals the connection between nilpotent matrices and DAGs. 


3.25. Theorem: DAGs and Nilpotent Matrices. Let G be a digraph with vertex set
$X = \{x_1, \ldots, x_n\}$ and adjacency matrix A. G is a DAG iff A is nilpotent. When G is a
DAG, there exists an ordering of the vertex set X for which A is a strictly lower-triangular
matrix.


Proof. Assume first that G is not a DAG, so that G has at least one cycle. Let $x_i$ be any
fixed vertex involved in this cycle, and let $c \ge 1$ be the length of this cycle. By going around
the cycle zero or more times, we obtain walks from $x_i$ to $x_i$ of lengths 0, c, 2c, 3c, .... By
the Walk Rule, it follows that the (i, i)-entry of $A^{kc}$ is at least 1, for all $k \ge 0$. This fact
prevents any positive power of A from being the zero matrix (if $A^s = 0$, then $A^{sc} = 0$ as well), so A is not nilpotent.

Conversely, assume that A is not nilpotent. Then, in particular, $A^n \ne 0$, so there exist
indices $k_0, k_n$ with $A^n(k_0, k_n) \ne 0$. Using the Walk Rule again, we deduce that there is a
walk in G visiting a sequence of vertices

$$(x_{k_0}, x_{k_1}, x_{k_2}, \ldots, x_{k_{n-1}}, x_{k_n}).$$

Since G has only n vertices, not all of the n + 1 vertices listed here are distinct. Choose
j minimal such that $x_{k_j} = x_{k_i}$ for some $i < j$. Then the vertices $x_{k_i}, x_{k_{i+1}}, \ldots, x_{k_{j-1}}$ are
pairwise distinct, so there is a cycle in G visiting the vertices $(x_{k_i}, x_{k_{i+1}}, \ldots, x_{k_j})$. So G is not a DAG.

We prove the statement about lower-triangular matrices by induction on n. A DAG with
only one vertex must have adjacency matrix [0], so the result holds for n = 1. Suppose n > 1
and the result is known for DAGs with n − 1 vertices. Create a walk $(v_0, e_1, v_1, e_2, v_2, \ldots)$
in G by starting at any vertex and repeatedly following any edge leading away from the
current vertex. Since G has no cycles, the vertices $v_i$ reached by this walk are pairwise
distinct. Since there are only n available vertices, our walk must terminate at a vertex $v_j$
with no outgoing edges. Let $x = v_j$. Deleting x and all edges with x as one endpoint
produces an (n − 1)-vertex digraph G′ that is also a DAG, as one immediately verifies.
By induction, there is an ordering $x'_1, \ldots, x'_{n-1}$ of the vertices of G′ such that the associated
adjacency matrix A′ is strictly lower-triangular. Now, relative to the ordering $x, x'_1, \ldots, x'_{n-1}$
of the vertices of G, the adjacency matrix of G has the form

$$\left[\begin{array}{c|ccc}
0 & 0 & \cdots & 0 \\ \hline
* & & & \\
\vdots & & A' & \\
* & & &
\end{array}\right],$$

where each * denotes some nonnegative integer. This matrix is strictly lower-triangular. □
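
The inductive argument above is effectively an algorithm: repeatedly remove a vertex with no outgoing edges and list the vertices in order of removal. The Python sketch below follows that idea under our own naming (`lower_triangular_order` and `reorder` are not from the text), working with 0-based indices into the adjacency matrix. As sample input it uses the adjacency matrix given in Example 3.29.

```python
def lower_triangular_order(A):
    """Return a list of vertex indices relative to which the adjacency matrix A of a DAG
    is strictly lower-triangular. Raises ValueError if the digraph has a cycle."""
    remaining = set(range(len(A)))
    order = []
    while remaining:
        # A sink of the remaining digraph: no edges to any remaining vertex (itself included).
        sink = next((v for v in sorted(remaining)
                     if all(A[v][w] == 0 for w in remaining)), None)
        if sink is None:
            raise ValueError("no sink found: the digraph contains a cycle")
        order.append(sink)
        remaining.remove(sink)
    return order

def reorder(A, order):
    """Adjacency matrix of the same digraph relative to the new vertex ordering."""
    return [[A[u][v] for v in order] for u in order]

# Adjacency matrix of the DAG in Example 3.29 (vertex k is stored at index k - 1).
A = [[0, 0, 1, 1, 0, 0, 1, 0],
     [0, 0, 0, 1, 1, 1, 0, 0],
     [0, 0, 0, 0, 0, 0, 1, 1],
     [0, 0, 0, 0, 0, 1, 1, 0],
     [0, 0, 0, 0, 0, 2, 0, 0],
     [0, 0, 0, 0, 0, 0, 2, 1],
     [0, 0, 0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0, 0, 0, 0]]

order = lower_triangular_order(A)
B = reorder(A, order)
# Every entry on or above the main diagonal of B is zero.
print(all(B[p][q] == 0 for p in range(len(B)) for q in range(p, len(B))))  # True
```

When no sink exists among the remaining vertices, every remaining vertex has an outgoing edge to a remaining vertex, so following such edges must eventually revisit a vertex; this is why the sketch reports a cycle in that case.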
The next result leads to a formula for counting walks of any length in a DAG. 


3.26. Theorem: Inverse of I − A for Nilpotent A. Suppose A is a nilpotent $n \times n$
matrix with $A^s = 0$. Let I be the $n \times n$ identity matrix. Then I − A is an invertible matrix
with inverse

$$(I - A)^{-1} = I + A + A^2 + A^3 + \cdots + A^{s-1}.$$


Proof. Let $B = I + A + A^2 + \cdots + A^{s-1}$. By the distributive law for matrices,

$$(I - A)B = IB - AB = (I + A + A^2 + \cdots + A^{s-1}) - (A + A^2 + \cdots + A^{s-1} + A^s) = I - A^s.$$

Since $A^s = 0$, we see that (I − A)B = I. A similar calculation shows that B(I − A) = I.
Therefore B is the two-sided matrix inverse of I − A. □


3.27. Remark. The previous result for matrices can be remembered by noting the analogy
to the Geometric Series Formula for real or complex numbers: whenever $|r| < 1$,

$$(1 - r)^{-1} = \frac{1}{1-r} = 1 + r + r^2 + \cdots + r^{s-1} + r^s + \cdots.$$


FIGURE 3.4
Example of a DAG. 


3.28. The Path Rule for DAGs. Let G be a DAG with vertex set $\{x_1, \ldots, x_n\}$ and
adjacency matrix A. For all $i, j$ between 1 and n, the total number of paths from $x_i$ to $x_j$
in G (of any length) is the $i,j$-entry of $(I - A)^{-1}$.


Proof. By the Walk Rule, the number of walks of length $t \ge 0$ from $x_i$ to $x_j$ is $A^t(i,j)$.
Because G is a DAG, we have $A^t = 0$ for all $t \ge n$ (a walk of length n or more would repeat
a vertex and hence contain a cycle). By the Sum Rule, the total number
of walks from $x_i$ to $x_j$ is $\sum_{t=0}^{n-1} A^t(i,j)$. By Theorem 3.26, this number is precisely the
$i,j$-entry of $(I - A)^{-1}$. Finally, one readily confirms that every walk in a DAG must in fact
be a path. □


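3.29. Example. Consider the DAG shown in Figure 3.4. Its adjacency matrix is

$$A = \begin{bmatrix}
0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 2 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.$$

Using a computer algebra system, we compute

$$(I - A)^{-1} = \begin{bmatrix}
1 & 0 & 1 & 1 & 0 & 1 & 5 & 7 \\
0 & 1 & 0 & 1 & 1 & 4 & 9 & 13 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 1 & 0 & 1 & 3 & 4 \\
0 & 0 & 0 & 0 & 1 & 2 & 4 & 6 \\
0 & 0 & 0 & 0 & 0 & 1 & 2 & 3 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.$$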


So, for example, there are 13 paths from vertex 2 to vertex 8, and 4 paths from vertex 5 to 
vertex 7. 
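
No computer algebra system is strictly needed here: by Theorem 3.26, the inverse is the finite sum $I + A + \cdots + A^{n-1}$, which involves only integer arithmetic. The Python sketch below (helper names `mat_mult`, `identity`, and `path_count_matrix` are ours) computes that sum for the matrix of Example 3.29 and checks that multiplying by I − A gives the identity. Vertex k is stored at index k − 1.

```python
def mat_mult(A, B):
    """Matrix product as in Definition 3.16."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def path_count_matrix(A):
    """Return I + A + A^2 + ... + A^(n-1) for an n x n matrix A."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(n - 1):
        power = mat_mult(power, A)          # next power of A
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

A = [[0, 0, 1, 1, 0, 0, 1, 0],
     [0, 0, 0, 1, 1, 1, 0, 0],
     [0, 0, 0, 0, 0, 0, 1, 1],
     [0, 0, 0, 0, 0, 1, 1, 0],
     [0, 0, 0, 0, 0, 2, 0, 0],
     [0, 0, 0, 0, 0, 0, 2, 1],
     [0, 0, 0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0, 0, 0, 0]]

N = path_count_matrix(A)
I_minus_A = [[int(i == j) - A[i][j] for j in range(8)] for i in range(8)]
print(mat_mult(I_minus_A, N) == identity(8))   # True
print(N[1][7], N[4][6])                        # 13 4
```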
3.4 Vertex Degrees


The next definition introduces notation that records how many edges lead into or out of 
each vertex in a digraph. 


3.30. Definition: Indegree and Outdegree. Let $G = (V, E, \varepsilon)$ be a digraph. For each
$v \in V$, the outdegree of v, denoted $\mathrm{outdeg}_G(v)$, is the number of edges $e \in E$ leading away
from v; the indegree of v, denoted $\mathrm{indeg}_G(v)$, is the number of edges $e \in E$ leading to v. A
source is a vertex of indegree zero. A sink is a vertex of outdegree zero.


3.31. Example. Let $G_3$ be the digraph on the right in Figure 3.1. We have

$$(\mathrm{indeg}_{G_3}(1), \ldots, \mathrm{indeg}_{G_3}(7)) = (2,2,4,2,2,0,1);$$
$$(\mathrm{outdeg}_{G_3}(1), \ldots, \mathrm{outdeg}_{G_3}(7)) = (3,4,0,3,2,1,0).$$


Vertex 6 is the only source, whereas vertices 3 and 7 are sinks. A loop edge at v contributes 
1 to both the indegree and outdegree of v. The sum of all the indegrees is 13, which is 
also the sum of all the outdegrees, and is also the number of edges in the digraph. This 
phenomenon is explained in the next theorem. 


3.32. Theorem: Degree Sum Formula for Digraphs. In any digraph $G = (V, E, \varepsilon)$,

$$\sum_{v \in V} \mathrm{indeg}_G(v) = |E| = \sum_{v \in V} \mathrm{outdeg}_G(v).$$


Proof. For each $v \in V$, let $E_v$ be the set of edges $e \in E$ ending at v, meaning that
$\varepsilon(e) = (w, v)$ for some $w \in V$. By definition of indegree, $|E_v| = \mathrm{indeg}_G(v)$ for each $v \in V$.
The set E of all edges is the disjoint union of the sets $E_v$ as v varies through V. By the
Sum Rule, $|E| = \sum_{v \in V} |E_v| = \sum_{v \in V} \mathrm{indeg}_G(v)$. The formula involving outdegree is proved
similarly. O 
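The Degree Sum Formula for Digraphs is easy to test numerically. The sketch below is an illustration under our own assumptions: the edge list is a made-up digraph (not one of the figures in the text), and each edge is stored as a pair (tail, head).

```python
from collections import Counter

def degree_sequences(edges):
    """Return (indegree, outdegree) Counters for a digraph given as (tail, head) pairs."""
    indeg, outdeg = Counter(), Counter()
    for tail, head in edges:
        outdeg[tail] += 1   # the edge leads away from its tail
        indeg[head] += 1    # the edge leads to its head
    return indeg, outdeg

# Hypothetical digraph with a loop at vertex 2 and a repeated edge from 1 to 3.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (1, 3), (2, 2), (3, 1), (4, 3)]
indeg, outdeg = degree_sequences(edges)

# Both degree sums count every edge exactly once.
assert sum(indeg.values()) == len(edges) == sum(outdeg.values())

sources = [v for v in vertices if indeg[v] == 0]   # vertices of indegree zero
sinks = [v for v in vertices if outdeg[v] == 0]    # vertices of outdegree zero
```

Note that the loop at vertex 2 adds 1 to both the indegree and the outdegree of that vertex, as remarked in Example 3.31.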


Next we give analogous definitions and results for graphs. 


3.33. Definition: Degree. Let $G = (V, E, \epsilon)$ be a graph. For each $v \in V$, the degree of
$v$ in $G$, denoted $\deg_G(v)$, is the number of edges in $E$ having $v$ as an endpoint, where any
loop edge at $v$ is counted twice. The degree multiset of $G$, denoted $\deg(G)$, is the multiset
$[\deg_G(v) : v \in V]$. $G$ is called $k$-regular iff every vertex in $G$ has degree $k$. $G$ is regular iff $G$
is $k$-regular for some $k \ge 0$.


3.34. Example. For the graph $G_1$ in Figure 3.1, we have
$(\deg_{G_1}(1), \ldots, \deg_{G_1}(5)) = (3, 4, 4, 4, 3)$; $\deg(G_1) = [4, 4, 4, 3, 3]$.


The graph in Figure 3.3 is 3-regular. In both of these graphs, the sum of all vertex degrees 
is 18, which is twice the number of edges in the graph. 


3.35. Theorem: Degree Sum Formula for Graphs. For any graph $G = (V, E, \epsilon)$,

$$\sum_{v \in V} \deg_G(v) = 2|E|.$$

Proof. First assume $G$ has no loop edges. Let $X$ be the set of pairs $(v, e)$ such that $v \in V$,
$e \in E$, and $v$ is an endpoint of $e$. We count $X$ in two ways. On one hand, $X$ is the disjoint
union of sets $X_v$, where $X_v$ consists of pairs in $X$ with first component $v$. The second
component of such a pair can be any edge incident to $v$, so $|X_v| = \deg_G(v)$. By the Sum
Rule, $|X| = \sum_{v \in V} |X_v| = \sum_{v \in V} \deg_G(v)$. On the other hand, we can build an object $(v, e)$
in $X$ by first choosing $e$ (in any of $|E|$ ways) and then choosing one of the two endpoints $v$
of $e$ (2 ways, since $G$ has no loop edges). So $|X| = 2|E|$ by the Product Rule, and the result
holds in this case.

Next, if $G$ has $k$ loop edges, let $G'$ be $G$ with these loops deleted. Then

$$\sum_{v \in V} \deg_G(v) = \sum_{v \in V} \deg_{G'}(v) + 2k,$$

since each loop edge increases some vertex degree in the sum by 2. Using the result for
loopless graphs,

$$\sum_{v \in V} \deg_G(v) = 2|E(G')| + 2k = 2|E(G)|,$$

since $G$ has $k$ more edges than $G'$. □
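As a quick check of the Degree Sum Formula for Graphs, including the convention that a loop counts twice, consider the following sketch. The graph below is hypothetical (chosen only to include a loop, a repeated edge, and an isolated vertex), and edges are stored as unordered endpoint pairs.

```python
from collections import Counter

def degrees(vertices, edges):
    """Degree of each vertex in a graph given as a list of endpoint pairs.

    A loop (v, v) adds 2 to the degree of v, matching Definition 3.33.
    """
    deg = Counter({v: 0 for v in vertices})
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# Hypothetical graph: a double edge between 2 and 3, a loop at 3, and isolated vertex 5.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (2, 3), (3, 3), (4, 1)]
deg = degrees(vertices, edges)

assert sum(deg.values()) == 2 * len(edges)             # Theorem 3.35
degree_multiset = sorted(deg.values(), reverse=True)   # deg(G) listed in weakly decreasing order
```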
Vertices of low degree are given special names. 


3.36. Definition: Isolated Vertices and Leaves. An isolated vertex in a graph is a 
vertex of degree 0. A leaf is a vertex of degree 1. 


The following result will be used later in our analysis of trees. 


3.37. The Two-Leaf Lemma. Suppose G is a graph. One of the following three alterna- 
tives must occur: (i) G has a cycle; (ii) G has no edges; or (iii) G has at least two leaves. 


Proof. Suppose that $G$ has no cycles and $G$ has at least one edge; we prove that $G$ has two
leaves. Since $G$ has no cycles, we can assume $G$ is simple. Let $P = (v_0, v_1, \ldots, v_s)$ be a path
of maximum length in $G$. Such a path exists, since $G$ has only finitely many vertices and
edges. Observe that $s > 0$ since $G$ has an edge, and $v_0 \ne v_s$. Note that $\deg(v_s) \ge 1$ since
$s > 0$. Assume $v_s$ is not a leaf. Then there exists a vertex $w \ne v_{s-1}$ that is adjacent to $v_s$.
Now, $w$ is different from all $v_j$ with $0 \le j < s-1$, since otherwise $(v_j, v_{j+1}, \ldots, v_s, w = v_j)$
would be a cycle in $G$. But this means $(v_0, v_1, \ldots, v_s, w)$ is a path in $G$ longer than $P$,
contradicting maximality of $P$. So $v_s$ must be a leaf. A similar argument shows that $v_0$ is
also a leaf. □




3.5 Functional Digraphs 


We can obtain structural information about functions $f : V \to V$ by viewing these functions
as certain kinds of digraphs.


3.38. Definition: Functional Digraphs. A functional digraph on $V$ is a simple digraph
$G$ with vertex set $V$ such that $\operatorname{outdeg}_G(v) = 1$ for all $v \in V$.


A function $f : V \to V$ can be thought of as a set $E_f$ of ordered pairs such that for each
$x \in V$, there exists exactly one $y \in V$ with $(x, y) \in E_f$, namely $y = f(x)$. Then $(V, E_f)$ is a
functional digraph on $V$. Conversely, a functional digraph $G = (V, E)$ determines a unique
function $g : V \to V$ by letting $g(v)$ be the other endpoint of the unique edge in $E$ departing
from $v$. These comments establish a bijection between the set of functions on $V$ and the set
of functional digraphs on $V$.
FIGURE 3.5
A functional digraph. 


3.39. Example. Figure 3.5 displays the functional digraph associated to the following
function:

f(1)=15;  f(2)=16;  f(3)=8;   f(4)=17;  f(5)=5;
f(6)=5;   f(7)=4;   f(8)=3;   f(9)=6;   f(10)=4;
f(11)=10; f(12)=4;  f(13)=10; f(14)=1;  f(15)=12;
f(16)=1;  f(17)=15.


Our next goal is to understand the structure of functional digraphs. Consider the digraph 
G = (V, E) shown in Figure 3.5. Some of the vertices in this digraph are involved in cycles, 
which are drawn at the bottom of the figure. These cycles have length 1 or greater, and any 
two distinct cycles involve disjoint sets of vertices. The other vertices in the digraph all feed 
into these cycles at different points. We can form a set partition of the vertex set of the 
digraph by collecting together all vertices that feed into a particular vertex on a particular 
cycle. Each such collection can be viewed as a smaller digraph that has no cycles. We will 
show that these observations hold for all functional digraphs. To do this, we need a few 
more definitions.


3.40. Definition: Cyclic Vertices. Let $G$ be a functional digraph on $V$. A vertex $v \in V$
is called cyclic iff $v$ belongs to some cycle of $G$; otherwise, $v$ is called acyclic.


3.41. Example. The cyclic elements for the functional digraph in Figure 3.5 are 3, 4, 5, 
8, 12, 15, and 17. 


Let $f : V \to V$ be the function associated to the functional digraph $G$. Then $v \in V$
is cyclic iff $f^i(v) = v$ for some $i \ge 1$, where $f^i$ denotes the composition of $f$ with itself
$i$ times. This fact follows since the only possible cycle involving $v$ in $G$ must look like

$$(v, f(v), f^2(v), f^3(v), \ldots).$$


3.42. Definition: Rooted Trees. A digraph $G$ is called a rooted tree with root $v_0$ iff $G$ is
a functional digraph and $v_0$ is the unique cyclic vertex of $G$.


3.43. Theorem: Structure of Functional Digraphs. Let $G$ be a functional digraph on
a nonempty set $V$ with associated function $f : V \to V$. Let $C \subseteq V$ denote the set of cyclic
vertices of $G$. $C$ is nonempty, and each $v \in C$ belongs to exactly one cycle of $G$. Also, there
exists a unique indexed set partition $\{S_v : v \in C\}$ of $V$ such that the following hold for all
$v \in C$: (i) $v \in S_v$; (ii) $x \in S_v$ and $x \ne v$ implies $f(x) \in S_v$; (iii) if $g : S_v \to S_v$ is defined
by setting $g(x) = f(x)$ for $x \ne v$ and $g(v) = v$, then the functional digraph of $g$ is a rooted
tree with root $v$.




Proof. First, suppose $v \in C$. Since every vertex of $G$ has exactly one outgoing edge, the
only possible cycle involving $v$ must be $(v, f(v), f^2(v), \ldots, f^k(v) = v)$. So each cyclic vertex
(if any) belongs to a unique cycle of $G$. This implies that distinct cycles of $G$ involve disjoint
sets of vertices and edges.

Next we define a surjection $r : V \to C$. The existence of $r$ will show that $C \ne \emptyset$, since
$V \ne \emptyset$. Fix $u \in V$. By repeatedly following outgoing arrows, we obtain for each $k \ge 0$ a
unique walk $(u = u_0, u_1, u_2, \ldots, u_k)$ in $G$ of length $k$. Since $V$ is finite, there must exist $i < j$
with $u_i = u_j$. Take $i$ minimal and then $j$ minimal with this property; then $(u_i, u_{i+1}, \ldots, u_j)$
is a cycle in $G$. We define $r(u) = u_i$, which is the first element on this cycle reached from
$u$. It can be checked that $r(u) = u$ for all $u \in C$; this implies that $r$ is surjective. On the
other hand, if $u \notin C$, the definition of $r$ shows that $r(u) = r(u_1) = r(f(u))$.

How shall we construct a set partition with the stated properties? For each $v \in C$,
consider the fiber $S_v = r^{-1}(\{v\}) = \{w \in V : r(w) = v\}$; then $\{S_v : v \in C\}$ is a set
partition of $V$ indexed by $C$. The remarks at the end of the last paragraph show that this
set partition satisfies (i) and (ii). To check (iii) for fixed $v \in C$, first note that the map $g$
defined in (iii) does map $S_v$ into $S_v$ by (i) and (ii). Suppose $W = (w_0, w_1, \ldots, w_k)$ is a cycle
in the functional digraph for $g$. Since $r(w_0) = v$, we will eventually reach $v$ by following
outgoing arrows starting at $w_0$. On the other hand, following these arrows keeps us on the
cycle $W$, so some $w_i = v$. Since $g(v) = v$, the only possibility is that $W$ is the 1-cycle $(v)$.
Thus (iii) holds for each $v \in C$.

To see that $\{S_v : v \in C\}$ is unique, let $\mathcal{P} = \{T_v : v \in C\}$ be another set partition
with properties (i), (ii), and (iii). It is enough to show that $S_v \subseteq T_v$ for each $v \in C$. Fix
$v \in C$ and $z \in S_v$. By (ii), every element in the sequence $(z, f(z), f^2(z), \ldots)$ belongs to the
same set of $\mathcal{P}$, say $T_w$. Then $v = r(z) = f^i(z) \in T_w$ for some $i \ge 0$, so (i) forces $w = v$. Thus $z \in T_v$ as
needed. □
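The construction in this proof can be carried out mechanically. The following Python sketch is only an illustration: the helper names `cyclic_vertices` and `root_map` are our own, and the function $f$ is stored as a dictionary. It computes the set $C$, the map $r$, and the blocks $S_v$ for the function of Example 3.39.

```python
def cyclic_vertices(f):
    """Cyclic vertices of a function f: V -> V stored as a dict.

    Applying f to V a total of |V| times leaves exactly the vertices that lie on cycles.
    """
    image = set(f)
    for _ in range(len(f)):
        image = {f[x] for x in image}
    return image

def root_map(f):
    """Return r: V -> C, where r(u) is the first cyclic vertex reached from u."""
    cyclic = cyclic_vertices(f)
    r = {}
    for u in f:
        w = u
        while w not in cyclic:   # follow outgoing arrows until a cycle is reached
            w = f[w]
        r[u] = w
    return r

# The function of Example 3.39.
f = {1: 15, 2: 16, 3: 8, 4: 17, 5: 5, 6: 5, 7: 4, 8: 3, 9: 6, 10: 4,
     11: 10, 12: 4, 13: 10, 14: 1, 15: 12, 16: 1, 17: 15}

C = cyclic_vertices(f)
r = root_map(f)
# The fibers S_v = r^{-1}({v}) form the set partition described in Theorem 3.43.
blocks = {v: sorted(u for u in f if r[u] == v) for v in sorted(C)}
```

Here `C` agrees with the list of cyclic elements in Example 3.41, and each entry of `blocks` is the vertex set of one rooted tree feeding into the cycles.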


We can informally summarize the previous result by saying that every functional digraph 
uniquely decomposes into disjoint rooted trees feeding into one or more disjoint cycles. There 
are two extreme cases of this decomposition that are especially interesting — the case where 
there are no trees (i.e., C = V), and the case where the whole digraph is a rooted tree (i.e., 
|C| = 1). We study these types of functional digraphs in the next two sections. 




3.6 Cycle Structure of Permutations 


The functional digraph of a bijection (permutation) has special structure, as we see in the 
next example. 


3.44. Example. Figure 3.6 displays the digraph associated to the following bijection:

h(1)=7;  h(2)=8;  h(3)=4;  h(4)=3;  h(5)=10;
h(6)=2;  h(7)=5;  h(8)=6;  h(9)=9;  h(10)=1.
We see that the digraph for h contains only cycles; there are no trees feeding into these 
cycles. To see why this happens, compare this digraph to the digraph for the non-bijective 
function f in Figure 3.5. The digraph for f has a rooted tree feeding into the cyclic vertex 
15. Accordingly, f is not injective since f(17) = 15 = f(1). Similarly, if we move backward 
through the trees in the digraph of f, we reach vertices with indegree zero (namely 2, 7, 9, 
11, 13, and 14). The existence of such vertices shows that f is not surjective. Returning to 
FIGURE 3.6
Digraph associated to a permutation. 


the digraph of h, consider what happens if we reverse the direction of all the edges in the 
digraph. We obtain another functional digraph corresponding to the following function: 


h'(1)=10;  h'(2)=6;  h'(3)=4;  h'(4)=3;  h'(5)=7;
h'(6)=8;   h'(7)=1;  h'(8)=2;  h'(9)=9;  h'(10)=5.


One sees immediately that h’ is the two-sided inverse for h, so that h is bijective. 
The next theorem explains the observations in the last example. 


3.45. Theorem: Cycle Decomposition of Permutations. Let $f : V \to V$ be a function
with functional digraph $G$. The map $f$ is a bijection iff every $v \in V$ is a cyclic vertex in $V$.
In this situation, $G$ is a disjoint union of cycles.


Proof. Suppose $u \in V$ is a non-cyclic vertex. By Theorem 3.43, $u$ belongs to a rooted tree
$S_v$ whose root $v$ belongs to a cycle of $G$. Following edges outward from $u$ will eventually
lead to $v$; let $y$ be the vertex in $S_v$ just before $v$ on this path. Let $z$ be the vertex just
before $v$ in the unique cycle involving $v$. We have $y \ne z$, but $f(y) = v = f(z)$. Thus, $f$ is
not injective.

Conversely, suppose all vertices in $V$ are cyclic. Then the digraph $G$ is a disjoint union of
directed cycles. So every $v \in V$ has indegree 1 as well as outdegree 1. Reversing the direction
of every edge in $G$ therefore produces another functional digraph $G'$. Let $f' : V \to V$ be the
function associated to this new digraph. For all $a, b \in V$, we have $b = f(a)$ iff $(a, b) \in E(G)$
iff $(b, a) \in E(G')$ iff $a = f'(b)$. It follows that $f'$ is the two-sided inverse for $f$, so that $f$ and
$f'$ are both bijections. □
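To make the theorem concrete, the sketch below lists the cycles of the bijection $h$ from Example 3.44 and recovers its inverse by reversing edges. This is an illustration under our own conventions: the helper names `cycles` and `reverse_edges` are hypothetical, and the values $h(5) = 10$ and $h(10) = 1$ are the ones forced by the inverse function $h'$ displayed in that example.

```python
def cycles(h):
    """Cycle decomposition of a permutation stored as a dict.

    Each cycle is listed starting from its smallest element.
    """
    seen, result = set(), []
    for start in sorted(h):
        if start in seen:
            continue
        cycle, v = [], start
        while v not in seen:
            seen.add(v)
            cycle.append(v)
            v = h[v]
        result.append(tuple(cycle))
    return result

def reverse_edges(h):
    """Reverse every edge (a, h(a)); return None unless the result is again a function on V."""
    reversed_map = {b: a for a, b in h.items()}
    return reversed_map if len(reversed_map) == len(h) else None

h = {1: 7, 2: 8, 3: 4, 4: 3, 5: 10, 6: 2, 7: 5, 8: 6, 9: 9, 10: 1}
h_cycles = cycles(h)
h_inverse = reverse_edges(h)

# A non-injective function has a vertex of indegree at least 2, so reversal fails.
not_a_bijection = reverse_edges({1: 1, 2: 1})
```

The dictionary `h_inverse` reproduces the function $h'$ of Example 3.44, and `h_cycles` contains four cycles, so this permutation is one of those counted by $c(10, 4)$ in the notation introduced next.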


Recall that $S(n, k)$, the Stirling number of the second kind, is the number of set partitions
of an $n$-element set into $k$ blocks. Let $c(n, k)$ be the number of permutations of an $n$-element
set whose functional digraph consists of $k$ disjoint cycles. We will show that $c(n, k) =
s'(n, k)$, the signless Stirling number of the first kind. Recall from Recursion 2.58 that the
numbers $s'(n, k)$ satisfy $s'(n, k) = s'(n-1, k-1) + (n-1)s'(n-1, k)$ for $0 < k < n$, with
initial conditions $s'(n, 0) = 0$ for $n > 0$ and $s'(n, n) = 1$ for $n \ge 0$.


3.46. Theorem: Recursion for $c(n, k)$. For $0 < k < n$, we have

$$c(n, k) = c(n-1, k-1) + (n-1)c(n-1, k).$$

The initial conditions are $c(n, 0) = 0$ for $n > 0$ and $c(n, n) = 1$ for $n \ge 0$. Therefore,
$c(n, k) = s'(n, k)$ for $0 \le k \le n$.

Proof. The identity map is the unique permutation of an $n$-element set with $n$ cycles (which
must each have length 1), so $c(n, n) = 1$ for $n \ge 0$. The only permutation with zero cycles
is the empty function on the empty set, so $c(n, 0) = 0$ for $n > 0$. Now suppose $0 < k < n$.
Let $A$, $B$, $C$ be the sets of permutations counted by $c(n, k)$, $c(n-1, k-1)$, and $c(n-1, k)$,
respectively. Note that $A$ is the disjoint union of the two sets

$$A_1 = \{f \in A : f(n) = n\} \quad\text{and}\quad A_2 = \{f \in A : f(n) \ne n\}.$$

For each $f \in A_1$, we can restrict $f$ to the domain $\{1, 2, \ldots, n-1\}$ to obtain a permutation
of these $n-1$ elements. Since $f$ has $k$ cycles, one of which involves $n$ alone, the restriction
of $f$ must have $k-1$ cycles. Since $f \in A_1$ is uniquely determined by its restriction to
$\{1, 2, \ldots, n-1\}$, we have a bijection from $A_1$ onto $B$.

On the other hand, let us build a typical element $f \in A_2$ by making two choices.
First, choose a permutation $g \in C$ in $c(n-1, k)$ ways. Second, choose an element $i \in
\{1, 2, \ldots, n-1\}$ in $n-1$ ways. Let $j$ be the unique number such that $g(j) = i$. Modify the
digraph for $g$ by removing the arrow from $j$ to $i$ and replacing it by an arrow from $j$ to
$n$ and an arrow from $n$ to $i$. Informally, we are splicing $n$ into the cycle just before $i$. Let
$f$ be the permutation associated to the new digraph. Evidently, the splicing process does
not change the number of cycles of $g$, and $f$ satisfies $f(n) \ne n$. Thus, $f \in A_2$, and every
element of $A_2$ arises uniquely by the choice process we have described. By the Sum Rule
and Product Rule,

$$c(n, k) = |A| = |A_1| + |A_2| = c(n-1, k-1) + (n-1)c(n-1, k).$$

So $c(n, k)$ and $s'(n, k)$ satisfy the same recursion and initial conditions. A routine induction
proof now shows that $c(n, k) = s'(n, k)$ for all integers $n, k \ge 0$. □
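The recursion can be checked against a direct enumeration for small $n$. In the sketch below (an illustration only; the function names are our own), `c` implements the recursion and initial conditions of Theorem 3.46, while `brute_force` counts permutations of $\{0, 1, \ldots, n-1\}$ with exactly $k$ cycles by examining all $n!$ of them.

```python
from functools import lru_cache
from itertools import permutations

@lru_cache(maxsize=None)
def c(n, k):
    """Number of permutations of an n-element set with k cycles, via Theorem 3.46."""
    if k == n:
        return 1            # only the identity map (this also covers n = k = 0)
    if k == 0 or k > n:
        return 0
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k)

def count_cycles(perm):
    """Number of cycles of a permutation given as a tuple with perm[i] the image of i."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = perm[v]
    return count

def brute_force(n, k):
    """Count permutations of {0, ..., n-1} having exactly k cycles by direct enumeration."""
    return sum(1 for p in permutations(range(n)) if count_cycles(p) == k)

# Compare the recursion with the enumeration for all 0 <= k <= n <= 6.
for n in range(7):
    for k in range(n + 1):
        assert c(n, k) == brute_force(n, k)
```

For instance, `c(4, 2)` evaluates to $c(3, 1) + 3c(3, 2) = 2 + 9 = 11$.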




3.7 Counting Rooted Trees 


Our goal in this section is to count rooted trees (see Definition 3.42) with a fixed root vertex. 


3.47. The Rooted Tree Rule. For all $n > 1$, there are $n^{n-2}$ rooted trees on the vertex
set $\{1, 2, \ldots, n\}$ with root 1.


Proof. Let $B$ be the set of rooted trees mentioned in the theorem. Let $A$ be the set of all
functions $f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ such that $f(1) = 1$ and $f(n) = n$. The Product
Rule shows that $|A| = n^{n-2}$. It therefore suffices to define maps $\phi : A \to B$ and $\phi' : B \to A$
that are mutual inverses. To define $\phi$, fix $f \in A$. Let $G_f$ be the functional digraph associated
with $f$, which has directed edges $(i, f(i))$ for $1 \le i \le n$. By Theorem 3.43, we can decompose
the vertex set $\{1, 2, \ldots, n\}$ of this digraph into some disjoint cycles $C_0, C_1, \ldots, C_k$ and
(possibly) some trees feeding into these cycles. For $0 \le i \le k$, let $\ell_i$ be the largest vertex in
cycle $C_i$, and write $C_i = (r_i, \ldots, \ell_i)$. We can choose the indexing of the cycles so that the
numbers $\ell_i$ satisfy $\ell_0 > \ell_1 > \ell_2 > \cdots > \ell_k$. Since $f(1) = 1$ and $f(n) = n$, 1 and $n$ belong
to cycles of length 1, so that $\ell_0 = r_0 = n$, $C_0 = (n)$, $\ell_k = r_k = 1$, $C_k = (1)$, and $k > 0$.
To obtain $\phi(f)$, we modify the digraph $G_f$ by removing all edges of the form $(\ell_i, r_i)$ and
adding new edges $(\ell_i, r_{i+1})$, for $0 \le i < k$. It can be checked that $\phi(f)$ is always a rooted
tree with root 1.


3.48. Example. Suppose $n = 20$ and $f$ is the function defined as follows:

f(1)=1;   f(2)=19;  f(3)=8;   f(4)=17;  f(5)=5;
f(6)=5;   f(7)=4;   f(8)=3;   f(9)=6;   f(10)=1;
f(11)=18; f(12)=4;  f(13)=18; f(14)=20; f(15)=15;
f(16)=1;  f(17)=12; f(18)=4;  f(19)=20; f(20)=20.


We draw the digraph of f in such a way that all vertices involved in cycles occur in a hori- 
zontal row at the bottom of the figure, and the largest element in each cycle is the rightmost 
element of its cycle. We arrange these cycles so that these largest elements decrease from 
Conversion of the digraph to a rooted tree.


left to right; in particular, vertex $n$ is always at the far left, and vertex 1 at the far right.
See Figure 3.7. To compute $\phi(f)$, we cut the back-edges leading left from $\ell_i$ to $r_i$ (which
are loops if $\ell_i = r_i$) and add new edges leading right from $\ell_i$ to $r_{i+1}$. See Figure 3.8.


Continuing the proof, let us see why $\phi$ is invertible. Let $T$ be a rooted tree on $\{1, 2, \ldots, n\}$
with root 1. Following outgoing edges from $n$ must eventually lead to the unique cyclic vertex
1. Let $P = (v_0, v_1, \ldots, v_s)$ be the vertices encountered on the way from $v_0 = n$ to $v_s = 1$.
We recursively recover the numbers $\ell_0, \ell_1, \ldots, \ell_k$ as follows. Let $\ell_0 = n$. Define $\ell_1$ to be the
largest number in $P$ following $\ell_0$. In general, after $\ell_{i-1}$ has been found, define $\ell_i$ to be the
largest number in $P$ following $\ell_{i-1}$. After finitely many steps, we will get $\ell_k = 1$ for some
$k$. Next, let $r_0 = n$, and for $i > 0$, let $r_i$ be the vertex immediately following $\ell_{i-1}$ on the
path $P$. Modify $T$ by deleting the edges $(\ell_i, r_{i+1})$ and adding edges of the form $(\ell_i, r_i)$, for
$0 \le i < k$. It can be verified that every vertex in the resulting digraph $G'$ has outdegree
exactly 1, and there are loop edges in $G'$ at vertex 1 and vertex $n$. Thus, $G'$ is a functional
digraph that determines a function $f = \phi'(T) \in A$. It follows from the definition of $\phi'$ that
$\phi'$ is the two-sided inverse of $\phi$. □
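The two maps in this proof are short enough to program. The Python sketch below is one possible rendering, not code from the text: the names `phi`, `phi_inverse`, `cyclic_vertices`, and `check` are our own, functions and rooted trees are both stored as dictionaries sending each vertex to the head of its outgoing edge, and the input to `phi` is assumed to satisfy $f(1) = 1$ and $f(n) = n$.

```python
from itertools import product

def cyclic_vertices(f):
    """Vertices lying on cycles of the functional digraph of f (a dict)."""
    image = set(f)
    for _ in range(len(f)):
        image = {f[x] for x in image}
    return image

def phi(f):
    """Turn a function with f(1) = 1 and f(n) = n into a rooted tree with root 1."""
    leaders, seen = [], set()
    # Visiting cyclic vertices in decreasing order meets each cycle at its largest
    # element first, so leaders becomes [l_0, l_1, ..., l_k] with l_0 > l_1 > ... > l_k.
    for v in sorted(cyclic_vertices(f), reverse=True):
        if v in seen:
            continue
        leaders.append(v)
        w = v
        while w not in seen:
            seen.add(w)
            w = f[w]
    tree = dict(f)
    for i in range(len(leaders) - 1):
        # Replace the back-edge (l_i, r_i) by (l_i, r_{i+1}), where r_{i+1} = f(l_{i+1}).
        tree[leaders[i]] = f[leaders[i + 1]]
    return tree

def phi_inverse(tree):
    """Recover the function f from a rooted tree with root 1 on {1, ..., n}."""
    n = len(tree)
    path = [n]
    while path[-1] != 1:
        path.append(tree[path[-1]])
    g = dict(tree)
    start = 0
    while start < len(path):
        segment = path[start:]
        end = start + segment.index(max(segment))  # position of the next l_i
        g[path[end]] = path[start]                 # restore the back-edge (l_i, r_i)
        start = end + 1
    return g

def check(n):
    """Apply phi to every f in A and return the number of distinct rooted trees produced."""
    images = set()
    for middle in product(range(1, n + 1), repeat=n - 2):
        f = dict(zip(range(1, n + 1), (1, *middle, n)))
        tree = phi(f)
        assert cyclic_vertices(tree) == {1}   # a rooted tree with root 1
        assert phi_inverse(tree) == f         # phi' undoes phi
        images.add(tuple(tree[v] for v in range(1, n + 1)))
    return len(images)
```

Calling `check(n)` for a small value such as $n = 4$ or $n = 5$ confirms that `phi` is injective on $A$ and lands in $B$; comparing the returned count with $n^{n-2}$ gives a sanity check on the Rooted Tree Rule.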


3.49. Example. Suppose n = 9 and T is the rooted tree shown on the left in Figure 3.9. 
We first redraw the picture of T so that the vertices on the path P from n to 1 occur in 
a horizontal row at the bottom of the picture, with n on the left and 1 on the right. We 
recover $\ell_i$ and $r_i$ by the procedure above, and then delete the appropriate edges of $T$ and
add appropriate back-edges to create cycles. The resulting functional digraph appears on 
Conversion of a rooted tree to a functional digraph.


the bottom right in Figure 3.9. So $\phi'(T)$ is the function $g$ defined as follows:

g(1)=1;  g(2)=2;  g(3)=2;  g(4)=9;  g(5)=9;
g(6)=7;  g(7)=6;  g(8)=9;  g(9)=9.




3.8 Connectedness and Components 


In many applications of graphs, we need to know whether every vertex is reachable from 
every other vertex. 


3.50. Definition: Connectedness. Let $G = (V, E, \epsilon)$ be a graph or digraph. $G$ is
connected iff for all $u, v \in V$, there is a walk in $G$ from $u$ to $v$.


3.51. Example. The graph $G_1$ in Figure 3.1 is connected, but the simple graph $G_2$ and
the digraph $G_3$ in that figure are not connected.


Connectedness can also be described using paths instead of walks.


3.52. Theorem: Walks and Paths. Let $G = (V, E, \epsilon)$ be a graph or digraph, and let
$u, v \in V$. There is a walk in $G$ from $u$ to $v$ iff there is a path in $G$ from $u$ to $v$.


Proof. Let $W = (v_0, e_1, v_1, \ldots, e_s, v_s)$ be a walk in $G$ from $u$ to $v$. We describe an algorithm
to convert the walk $W$ into a path from $u$ to $v$. If all the vertices $v_i$ are distinct, then the
edges $e_i$ must also be distinct, so $W$ is already a path. Otherwise, choose $i$ minimal such
that $v_i$ appears more than once in $W$, and then choose $j$ maximal such that $v_i = v_j$. Then
$W_1 = (v_0, e_1, v_1, \ldots, e_i, v_i, e_{j+1}, v_{j+1}, \ldots, e_s, v_s)$ is a walk from $u$ to $v$ of shorter length than
FIGURE 3.10
Converting a walk to a path. 


$W$. If $W_1$ is a path, we are done. Otherwise, we repeat the argument to obtain a walk $W_2$
from $u$ to $v$ that is shorter than $W_1$. Since the lengths keep decreasing, this process must
eventually terminate with a path $W_k$ from $u$ to $v$. ($W_k$ has length zero if $u = v$.) The
converse is immediate, since every path in $G$ from $u$ to $v$ is a walk in $G$ from $u$ to $v$. □


3.53. Example. In the simple graph shown in Figure 3.10, consider the walk
$W = (11, 10, 1, 2, 10, 3, 4, 8, 11, 8, 12, 10, 6, 9, 7, 13, 9, 12, 8, 5)$.
First, the repetition $v_0 = 11 = v_8$ leads to the walk
$W_1 = (11, 8, 12, 10, 6, 9, 7, 13, 9, 12, 8, 5)$.
Eliminating the multiple visits to vertex 8 leads to the walk
$W_2 = (11, 8, 5)$.
$W_2$ is a path from 11 to 5.
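The splicing algorithm from the proof of Theorem 3.52 can be sketched in a few lines of Python. We assume here that the graph is simple, so that a walk is determined by its list of vertices; the function name `walk_to_path` is our own.

```python
def walk_to_path(walk):
    """Shorten a walk (given as a list of vertices) to a path with the same endpoints.

    At each step, choose i minimal such that walk[i] is repeated and j maximal with
    walk[j] == walk[i]; then delete the portion of the walk strictly after position i
    up to and including position j.
    """
    walk = list(walk)
    while True:
        for i, v in enumerate(walk):
            j = len(walk) - 1 - walk[::-1].index(v)   # last position where v occurs
            if j > i:
                walk = walk[:i + 1] + walk[j + 1:]
                break
        else:
            return walk   # no vertex is repeated, so the walk is a path

W = [11, 10, 1, 2, 10, 3, 4, 8, 11, 8, 12, 10, 6, 9, 7, 13, 9, 12, 8, 5]
print(walk_to_path(W))
```

On the walk $W$ of Example 3.53, the intermediate lists are exactly $W_1$ and $W_2$, and the function returns the path $(11, 8, 5)$.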


3.54. Corollary: Connectedness and Paths. A graph or digraph $G = (V, E, \epsilon)$ is
connected iff for all $u, v \in V$, there is at least one path in $G$ from $u$ to $v$.


By looking at pictures of graphs, it becomes visually evident that any graph decomposes 
into a disjoint union of connected pieces, with no edge joining vertices in two separate pieces. 
These pieces are called the (connected) components of the graph. The situation for digraphs 
is more complicated, since there may exist directed edges between different components. To 
give a formal development of these ideas, we introduce the following equivalence relation. 


3.55. Definition: Interconnection Relation. Let $G = (V, E, \epsilon)$ be a graph or digraph.
Define a binary relation $\leftrightarrow_G$ on the vertex set $V$ by setting $u \leftrightarrow_G v$ iff there exist walks in
$G$ from $u$ to $v$ and from $v$ to $u$.


In the case of graphs, note that $u \leftrightarrow_G w$ iff there is a walk in $G$ from $u$ to $w$, since the
reversal of such a walk is a walk in $G$ from $w$ to $u$. Now, for a graph or digraph $G$, let us
verify that $\leftrightarrow_G$ is indeed an equivalence relation on $V$. First, for all $u \in V$, $(u)$ is a walk of
length 0 from $u$ to $u$, so $u \leftrightarrow_G u$, and $\leftrightarrow_G$ is reflexive on $V$. Second, the symmetry of $\leftrightarrow_G$
is automatic from the way we defined $\leftrightarrow_G$: $u \leftrightarrow_G v$ implies $v \leftrightarrow_G u$ for all $u, v \in V$. Finally,
to check transitivity, suppose $u, v, w \in V$ satisfy $u \leftrightarrow_G v$ and $v \leftrightarrow_G w$. Let $W_1$, $W_2$, $W_3$,
and $W_4$ be walks in $G$ from $u$ to $v$, from $v$ to $w$, from $v$ to $u$, and from $w$ to $v$, respectively.
Then the concatenation of $W_1$ followed by $W_2$ is a walk in $G$ from $u$ to $w$, whereas the
concatenation of $W_4$ followed by $W_3$ is a walk in $G$ from $w$ to $u$. Hence $u \leftrightarrow_G w$, as needed.


3.56. Definition: Components. Let $G = (V, E, \epsilon)$ be a graph or digraph. The components
of $G$ are the equivalence classes of the interconnection equivalence relation $\leftrightarrow_G$. Components
are also called connected components or (in the case of digraphs) strong components.


Since $\leftrightarrow_G$ is an equivalence relation on $V$, the components of $G$ form a set partition
of the vertex set $V$. Given a component $C$ of $G$, consider the graph or digraph $(C, E', \epsilon')$
obtained by retaining those edges in $E$ with both endpoints in $C$ and restricting $\epsilon$ to this
set of edges. One may check that this graph or digraph is connected.


3.57. Example. The components of the graph G2 in Figure 3.1 are {0} and {1, 2, 3, 4, 5, 6}. 
The components of the digraph G3 in that figure are {1,2,4,5}, {3}, {6}, and {7}. 


The next theorems describe how the addition or deletion of an edge affects the compo- 
nents of a graph. 


3.58. Theorem: Edge Deletion and Components. Let $G = (V, E, \epsilon)$ be a graph with
components $\{C_i : i \in I\}$. Let $e \in E$ be an edge with endpoints $v, w \in C_j$. Let $G' = (V, E', \epsilon')$
where $E' = E - \{e\}$ and $\epsilon'$ is the restriction of $\epsilon$ to $E'$. (a) If $e$ appears in some cycle of $G$,
then $G$ and $G'$ have the same components. (b) If $e$ appears in no cycle of $G$, then $G'$ has
one more component than $G$. More precisely, the components of $G'$ are the $C_k$ with $k \ne j$,
together with two disjoint sets $A$ and $B$ such that $A \cup B = C_j$, $v \in A$, and $w \in B$.


Proof. For (a), let $(v_0, e_1, v_1, e_2, \ldots, v_s)$ be a cycle of $G$ containing $e$. Cyclically shifting and
reversing the cycle if needed, we can assume $v_0 = v = v_s$, $e_1 = e$, and $v_1 = w$. Statement (a)
will follow if we can show that the interconnection relations $\leftrightarrow_G$ and $\leftrightarrow_{G'}$ coincide. First,
for all $y, z \in V$, $y \leftrightarrow_{G'} z$ implies $y \leftrightarrow_G z$ since every walk in the smaller graph $G'$ is also a
walk in $G$. On the other hand, does $y \leftrightarrow_G z$ imply $y \leftrightarrow_{G'} z$? We know there is a walk $W$
from $y$ to $z$ in $G$. If $W$ does not use the edge $e$, $W$ is a walk from $y$ to $z$ in $G'$. Otherwise,
we can modify $W$ as follows. Every time $W$ goes from $v = v_s = v_0$ to $w = v_1$ via $e$, replace
this part of the walk by the sequence $(v_s, e_s, \ldots, e_2, v_1)$ obtained by taking a detour around
the cycle. Make a similar modification each time $W$ goes from $w$ to $v$ via $e$. This produces
a walk in $G'$ from $y$ to $z$.

For (b), let us compute the equivalence classes of $\leftrightarrow_{G'}$. First, fix $z \in C_k$, where $k \ne j$.
The set $C_k$ consists of all vertices in $V$ reachable from $z$ by walks in $G$. It can be checked
that none of these walks can use the edge $e$, so $C_k$ is also the set of all vertices in $V$ reachable
from $z$ by walks in $G'$. So $C_k$ is the equivalence class of $z$ relative to both $\leftrightarrow_G$ and $\leftrightarrow_{G'}$.

Next, let $A$ and $B$ be the equivalence classes of $v$ and $w$ (respectively) relative to $\leftrightarrow_{G'}$.
By definition, $A$ and $B$ are two of the components of $G'$ (possibly the same component). We
now show that $A$ and $B$ are disjoint and that their union is $C_j$. If the equivalence classes $A$
and $B$ are not disjoint, then they must be equal. By Theorem 3.52, there must be a path
$(v_0, e_1, v_1, \ldots, e_s, v_s)$ in $G'$ from $v$ to $w$. Appending $e, v_0$ to this path would produce a cycle
in $G$ involving the edge $e$, which is a contradiction. Thus $A$ and $B$ are disjoint.

We now show that $A \cup B \subseteq C_j$. If $z \in A$, then there is a walk in $G'$ (and hence in $G$)
from $v$ to $z$. Since $C_j$ is the equivalence class of $v$ relative to $\leftrightarrow_G$, it follows that $z \in C_j$.
Similarly, $z \in B$ implies $z \in C_j$ since $C_j$ is also the equivalence class of $w$ relative to $\leftrightarrow_G$.
Next, we check that $C_j \subseteq A \cup B$. Let $z \in C_j$, and let $W = (w_0, e_1, w_1, \ldots, w_t)$ be a walk in
$G$ from $v$ to $z$. If $W$ does not use the edge $e$, then $z \in A$. If $W$ does use $e$, then the portion
of $W$ following the last appearance of the edge $e$ is a walk from either $v$ or $w$ to $z$ in $G'$;
thus $z \in A \cup B$. Since the union of $A$, $B$, and the $C_k$ with $k \ne j$ is all of $V$, we have found
all the components of $G'$. □
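Both cases of Theorem 3.58 can be observed on a small example. The sketch below is an illustration only: the graph is a hypothetical one (a triangle with a pendant edge and an isolated vertex), the helper names are our own, and components are found by a standard depth-first search rather than by the equivalence-class argument in the proof.

```python
def components(vertices, edges):
    """Connected components of a graph, returned as a set of frozensets of vertices."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, parts = set(), set()
    for start in vertices:
        if start in seen:
            continue
        block, stack = set(), [start]
        while stack:                      # depth-first search from start
            x = stack.pop()
            if x not in block:
                block.add(x)
                stack.extend(neighbors[x] - block)
        seen |= block
        parts.add(frozenset(block))
    return parts

def delete_edge(edges, e):
    """Return a copy of the edge list with one copy of e removed."""
    remaining = list(edges)
    remaining.remove(e)
    return remaining

# Hypothetical graph: triangle 1-2-3, pendant edge 3-4, isolated vertex 5.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]

before = components(vertices, edges)
after_cycle_edge = components(vertices, delete_edge(edges, (1, 2)))  # case (a)
after_cut_edge = components(vertices, delete_edge(edges, (3, 4)))    # case (b)
```

Deleting the edge $(1, 2)$, which lies on the triangle, leaves the components unchanged, whereas deleting $(3, 4)$, which lies on no cycle, splits $\{1, 2, 3, 4\}$ into $A = \{1, 2, 3\}$ and $B = \{4\}$.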


The previous result suggests the following terminology. 


3.59. Definition: Cut-Edges. An edge e in a graph G is a cut-edge iff e does not appear 
in any cycle of G. 


3.60. Theorem: Edge Addition and Components. Let $G = (V, E, \epsilon)$ be a graph with
components $\{C_i : i \in I\}$. Let $G^+ = (V, E^+, \epsilon^+)$ be the graph obtained from $G$ by adding a
new edge $e$ with endpoints $v \in C_j$ and $w \in C_k$. (a) If $v$ and $w$ are in the same component $C_j$
of $G$, then $e$ is involved in a cycle of $G^+$, and $G$ and $G^+$ have the same components. (b) If
$v$ and $w$ are in different components of $G$, then $e$ is a cut-edge of $G^+$, and the components
of $G^+$ are $C_j \cup C_k$ and the $C_i$ with $i \ne j, k$.


This theorem follows readily from Theorem 3.58, so we omit the proof. 




3.9 Forests 


3.61. Definition: Forests. A forest is a graph with no cycles. Such a graph is also called 
acyclic. 


A forest cannot have any loops or multiple edges between the same two vertices. So we 
can assume, with no loss of generality, that forests are simple graphs. 


3.62. Example. Figure 3.11 displays a forest. 




FIGURE 3.11 
A forest. 


Recall from Corollary 3.54 that a graph G is connected iff there exists at least one 
path between any two vertices of G. The next result gives an analogous characterization of 
forests. 


3.63. Theorem: Forests and Paths. A graph G is acyclic iff G has no loops and for all 
u,v in V(G), there is at most one path from u to v in G. 


Proof. We prove the contrapositive in both directions. First suppose that $G$ has a cycle
$C = (v_0, e_1, v_1, \ldots, e_s, v_s)$. If $s = 1$, $G$ has a loop. If $s > 1$, then $(v_1, e_2, \ldots, e_s, v_s)$ and
$(v_1, e_1, v_0)$ are two distinct paths in $G$ from $v_1$ to $v_0$.

For the converse, we may assume $G$ is simple since non-simple graphs must have cycles of
length 1 or 2. The simple graph $G$ has no loops, so we can assume that for some $u, v \in V(G)$,
there exist two distinct paths $P$ and $Q$ from $u$ to $v$ in $G$. Among all such choices of $u$, $v$,
$P$, and $Q$, choose one for which the path $P$ has minimum length. Let $P$ visit the sequence
of vertices $(x_0, x_1, \ldots, x_s)$, and let $Q$ visit the sequence of vertices $(y_0, y_1, \ldots, y_t)$, where
$x_0 = u = y_0$, $x_s = v = y_t$, and $s$ is minimal. We must prove $G$ has a cycle.

First note that $s > 0$; otherwise $u = v$ and $Q$ would not be a path. Second, we assert that
$x_1 \ne y_1$; otherwise, we would have two distinct paths $P' = (x_1, \ldots, x_s)$ and $Q' = (y_1, \ldots, y_t)$
from $x_1$ to $v$ with $P'$ shorter than $P$, contradicting minimality of $P$. More generally, we
claim that $x_i \ne y_j$ for all $i, j$ satisfying $1 \le i < s$ and $1 \le j < t$. For if we had $x_i = y_j$
for some $i, j$ in this range, then $P'' = (x_0, x_1, \ldots, x_i)$ and $Q'' = (y_0, y_1, \ldots, y_j)$ would be
two paths from $u$ to $x_i$ with $P''$ shorter than $P$, and $P'' \ne Q''$ since $x_1 \ne y_1$. This again
contradicts minimality of $P$. Since $x_s = v = y_t$ and $x_0 = u = y_0$, it now follows that

$$(x_0, x_1, \ldots, x_s, y_{t-1}, y_{t-2}, \ldots, y_1, y_0)$$

is a cycle in $G$. □
The following result gives a formula for the number of components in a forest. 


3.64. Theorem: Components of a Forest. Let G be a forest with n vertices and k 
edges. The number of connected components of G is n — k. 


Proof. We use induction on $k$. The result holds for $k = 0$, since $G$ consists of $n$ isolated
vertices in this case. Assume that $k > 0$ and the result is already known for forests with $n$
vertices and $k-1$ edges. Given a forest $G$ with $n$ vertices and $k$ edges, remove one edge $e$
from $G$ to get a new graph $H$. The graph $H$ is acyclic and has $n$ vertices and $k-1$ edges.
By induction, $H$ has $n - (k-1) = n - k + 1$ components. On the other hand, $e$ must
be a cut-edge since $G$ has no cycles. It follows from Theorem 3.58 that $H$ has one more
component than $G$. Thus, $G$ has $n - k$ components, as needed. □
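The same count can be seen by building the forest one edge at a time, which is the viewpoint of Theorem 3.60(b): each new edge joins two different components and so lowers the number of components by one. The sketch below uses a union-find structure, which is a technique we are substituting for the induction in the text; the function names and the sample edge lists are our own.

```python
def find(parent, x):
    """Return the representative of the component containing x (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def forest_components(n, edges):
    """Number of components of the graph on vertices 0..n-1, or None if it has a cycle."""
    parent = list(range(n))
    count = n                      # with no edges, there are n isolated vertices
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru == rv:
            return None            # this edge closes a cycle (or is a loop): not a forest
        parent[ru] = rv            # the new edge merges two components into one
        count -= 1
    return count

# Hypothetical forest with n = 6 vertices and k = 3 edges.
forest_edges = [(0, 1), (1, 2), (3, 4)]
number_of_components = forest_components(6, forest_edges)
```

For the sample forest, the value returned is $n - k = 6 - 3 = 3$, corresponding to the components $\{0, 1, 2\}$, $\{3, 4\}$, and $\{5\}$.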




3.10 Trees 


3.65. Definition: Trees. A tree is a connected graph with no cycles. 


For example, Figure 3.12 displays a tree. 


FIGURE 3.12 
A tree. 
Every component of a forest is a tree, so every forest is a disjoint union of trees. The
next result is an immediate consequence of the Two-Leaf Lemma 3.37. 


3.66. Theorem: Trees Have Leaves. If T is a tree with more than one vertex, then T 
has at least two leaves. 


3.67. Definition: Pruning. Suppose $G = (V, E)$ is a simple graph, $v_0$ is a leaf in $G$, and
$e_0$ is the unique edge incident to the vertex $v_0$. The graph obtained by pruning $v_0$ from $G$ is
the graph $(V - \{v_0\}, E - \{e_0\})$.


3.68. Pruning Lemma. If $T$ is an $n$-vertex tree, $v_0$ is a leaf of $T$, and $T'$ is obtained from
$T$ by pruning $v_0$, then $T'$ is a tree with $n - 1$ vertices.


Proof. First, $T$ has no cycles, and the deletion of $v_0$ and the associated edge $e_0$ will not
create any cycles. So $T'$ is acyclic. Second, let $u, w \in V(T')$. There is a path from $u$ to $w$
in $T$. Since $u \ne v_0 \ne w$, this path will not use the edge $e_0$ or the vertex $v_0$. Thus there is a
path from $u$ to $w$ in $T'$, so $T'$ is connected. □


Figure 3.13 illustrates the Pruning Lemma. 


FIGURE 3.13

Pruning a leaf from a tree produces another tree. 


To give an application of pruning, we now prove a fundamental relationship between the 
number of vertices and edges in a tree (this also follows from Theorem 3.64). 


3.69. Theorem: Number of Edges in a Tree. If G is a tree with n > 0 vertices, then 
G has n — 1 edges. 


Proof. We use induction on $n$. If $n = 1$, then $G$ must have 0 edges. Fix $n > 1$, and assume
that the result holds for trees with $n - 1$ vertices. Let $T$ be a tree with $n$ vertices. We know
(by Theorem 3.66) that $T$ has at least one leaf; let $v_0$ be one such leaf. Let $T'$ be the graph obtained by pruning
$v_0$ from $T$. By the Pruning Lemma, $T'$ is a tree with $n - 1$ vertices. By induction, $T'$ has
$n - 2$ edges. Hence, $T$ has $n - 1$ edges. □


3.70. Theorem: Characterizations of Trees. Let G be a graph with n vertices. The 
following conditions are logically equivalent: 

(a) G is a tree (i.e., G is connected and acyclic). 

(b) G is connected and has at most n — 1 edges. 

(c) G is acyclic and has at least n — 1 edges. 

(d) G has no loop edges, and for all u, v ∈ V(G), there is a unique path in G from u to v.

Moreover, when these conditions hold, G has exactly n - 1 edges.

Proof. First, (a) implies (b) and (a) implies (c) by Theorem 3.69. Second, (a) is equivalent
to (d) by Corollary 3.54 and Theorem 3.63. Third, we prove (b) implies (a). Assume G is
connected with k ≤ n - 1 edges. If G has a cycle, delete one edge on some cycle of G. The
resulting graph is still connected (by Theorem 3.58) and has k - 1 edges. Continue to delete
edges in this way, one at a time, until there are no cycles. If we deleted i edges total, the
resulting graph is a tree with k - i ≤ n - 1 - i edges and n vertices. By Theorem 3.69, we
must have i = 0 and k = n - 1. So no edges were deleted, which means that the original
graph G is in fact a tree.

Fourth, we prove (c) implies (a). Assume G is acyclic with k ≥ n - 1 edges. If G is not
connected, add an edge joining two distinct components of G. The resulting graph is still
acyclic (by Theorem 3.60) and has k + 1 edges. Continue to add edges in this way, one at
a time, until the graph becomes connected. If we added i edges total, the resulting graph
is a tree with k + i ≥ n - 1 + i edges and n vertices. By Theorem 3.69, we must have i = 0
and k = n - 1. So no edges were added, which means that the original graph G is in fact a
tree. □
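
Characterization (b) suggests a simple computational test for being a tree. The following Python sketch is only an illustration: it assumes the graph is given as a vertex count n (with vertices labeled 0, ..., n - 1) and a list of edges, and the function names is_connected and is_tree are ours, not the text's.

```python
from collections import deque

def is_connected(n, edges):
    """Return True if the graph on vertices 0..n-1 with these edges is connected."""
    if n < 1:
        return False
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    queue = deque([0])
    while queue:  # breadth-first search from vertex 0
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n

def is_tree(n, edges):
    """Test condition (b) of Theorem 3.70: connected with at most n - 1 edges."""
    return len(edges) <= n - 1 and is_connected(n, edges)
```

For example, is_tree(4, [(0, 1), (1, 2), (1, 3)]) holds, while a triangle together with an isolated vertex fails the test because it is not connected.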


3.11 Counting Trees 

The next result, often attributed to Cayley, counts n-vertex trees. 

3.71. The Tree Rule. For all n ≥ 1, there are n^{n-2} trees with vertex set {1, 2, ..., n}.


3.72. Example. Figure 3.14 displays all 4^{4-2} = 16 trees with vertex set {1, 2, 3, 4}.


FIGURE 3.14
The 16 trees with vertex set {1, 2, 3, 4}.
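
For small n, the Tree Rule can be checked by brute force. The Python sketch below is only an illustration (the function name count_labeled_trees is ours): it enumerates all sets of n - 1 edges on {1, ..., n} and keeps the acyclic ones, which are exactly the trees by Theorem 3.70(c). Cycle detection uses a union-find structure; that is an implementation convenience, not part of the text's argument.

```python
from itertools import combinations

def count_labeled_trees(n):
    """Count trees with vertex set {1, ..., n} by exhaustive search."""
    vertices = range(1, n + 1)
    all_edges = list(combinations(vertices, 2))
    count = 0
    for edge_set in combinations(all_edges, n - 1):
        parent = {v: v for v in vertices}  # union-find forest

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v

        acyclic = True
        for u, v in edge_set:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:  # this edge would close a cycle
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:  # acyclic with n - 1 edges, hence a tree
            count += 1
    return count
```

The search examines every (n - 1)-element subset of the possible edges, so it is practical only for very small n.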


The Tree Rule 3.71 is an immediate consequence of the Rooted Tree Rule 3.47 and the 
following bijection. 


3.73. Theorem: Trees and Rooted Trees. Let V be a finite set and v_0 ∈ V. There is a
bijection from the set A of trees with vertex set V to the set B of rooted trees with vertex
set V and root v_0.

Proof. We define maps f : A → B and g : B → A that are two-sided inverses. First, given
T = (V, E) ∈ A, construct f(T) = (V, E') as follows. For each v ∈ V with v ≠ v_0, there
exists a unique path from v to v_0 in T. Letting e = {v, w} be the first edge on this path,
we add the directed edge (v, w) to E'. Also, we add the loop edge (v_0, v_0) to E'. Since T
has no cycles, the only possible cycle in the resulting functional digraph f(T) is the 1-cycle
(v_0). It follows that f(T) is a rooted tree on V with root v_0 (see Definition 3.42).

Next, given a rooted tree S ∈ B, define g(S) by deleting the unique loop edge (v_0, v_0)
and replacing every directed edge (v, w) by an undirected edge {v, w}. The resulting graph
g(S) has n vertices and n - 1 edges, where n = |V|. To see that g(S) is connected, fix y, z ∈ V. Following
outgoing edges from y (respectively z) in S produces a directed path from y (respectively
z) to v_0 in S. In the undirected graph g(S), we can concatenate the path from y to v_0 with
the reverse of the path from z to v_0 to get a walk from y to z. It follows from
Theorem 3.70(b) that g(S) is a tree.

It is routine to check that g ∘ f = id_A, since f assigns a certain orientation to each edge
of the original tree, and this orientation is then forgotten by g. It is somewhat less routine
to verify that f ∘ g = id_B; we leave this as an exercise. The key point to be checked is that
the edge orientations in f(g(S)) agree with the edge orientations in S, for each S ∈ B. □
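
The maps f and g in this proof are easy to compute. The Python sketch below is an illustration under our own conventions: a rooted tree is stored as a dictionary sending each vertex v to the endpoint w of its outgoing edge (v, w), with the root sent to itself, and the names tree_to_rooted and rooted_to_tree are hypothetical. A breadth-first search from v_0 finds, for each v, the first edge on the path from v to v_0.

```python
from collections import deque

def tree_to_rooted(vertices, edges, root):
    """The map f: orient every edge of the tree toward the chosen root."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    out_edge = {root: root}  # the loop edge (v_0, v_0)
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in out_edge:
                out_edge[v] = u  # {v, u} is the first edge on the path from v to the root
                queue.append(v)
    return out_edge

def rooted_to_tree(out_edge):
    """The map g: delete the loop edge and forget all orientations."""
    return {frozenset((v, w)) for v, w in out_edge.items() if v != w}
```

Applying rooted_to_tree after tree_to_rooted returns the original edge set, which is the identity g ∘ f = id_A on that input.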


A different bijective proof of the Tree Rule, based on parking functions, is presented in 
§12.5. We next prove a refinement of the Tree Rule that counts the number of trees such
that each vertex has a specified degree. We give an algebraic proof first, and then convert 
this to a bijective proof in the next section. 


3.74. Theorem: Counting Trees with Specified Degrees. Suppose n ≥ 2 and
d_1, ..., d_n ≥ 0 are fixed integers. If d_1 + ··· + d_n = 2n - 2, then there are

$$\binom{n-2}{d_1-1,\, d_2-1,\, \ldots,\, d_n-1} = \frac{(n-2)!}{\prod_{j=1}^{n} (d_j-1)!}$$

trees with vertex set {v_1, v_2, ..., v_n} such that deg(v_j) = d_j for all j. If d_1 + ··· + d_n ≠ 2n - 2,
then there are no such trees.


Proof. The last statement holds because any tree T on n vertices has n - 1 edges, and thus
$\sum_{i=1}^{n} \deg(v_i) = 2(n-1)$. Assume henceforth that d_1 + ··· + d_n = 2n - 2. We prove the
result by induction on n. First consider the case n = 2. If d_1 = d_2 = 1, there is exactly one
valid tree, and $\binom{0}{0,\,0} = 1$. For any other choice of d_1, d_2 adding to 2, there are no valid
trees, and $\binom{0}{d_1-1,\, d_2-1} = 0$ (a multinomial coefficient with a negative lower entry is zero
by convention).

Now assume n > 2 and that the theorem is known to hold for trees with n - 1 vertices.
Let A be the set of trees T with V(T) = {v_1, ..., v_n} and deg(v_j) = d_j for all j. If d_j = 0 for
some j, then A is empty and the formula in the theorem is zero by convention. Now suppose
d_j > 0 for all j. We must have d_i = 1 for some i, for otherwise d_1 + ··· + d_n ≥ 2n > 2n - 2.
Fix an i with d_i = 1. Note that v_i is a leaf in T for every T ∈ A. Now, for each k ≠ i
between 1 and n, define

$$A_k = \{T \in A : \{v_i, v_k\} \in E(T)\}.$$

A_k is the set of trees in A in which the leaf v_i is attached to the vertex v_k. A is the disjoint
union of the sets A_k.

Fix k ≠ i. Pruning the leaf v_i gives a bijection between A_k and the set B_k of all
trees with vertex set {v_1, ..., v_{i-1}, v_{i+1}, ..., v_n} such that deg(v_j) = d_j for j ≠ i, k and
deg(v_k) = d_k - 1. By induction hypothesis,

$$|B_k| = \frac{(n-3)!}{(d_k-2)! \prod_{1 \le j \le n,\; j \ne i,k} (d_j-1)!}.$$

Therefore, by the Sum Rule and the Bijection Rule,

$$\begin{aligned}
|A| = \sum_{k \ne i} |A_k| = \sum_{k \ne i} |B_k|
 &= \sum_{k \ne i} \frac{(n-3)!}{(d_k-2)! \prod_{j \ne i,k} (d_j-1)!} \\
 &= \sum_{k \ne i} \frac{(n-3)!\,(d_k-1)}{\prod_{j \ne i} (d_j-1)!}
  = \sum_{k \ne i} \frac{(n-3)!\,(d_k-1)}{\prod_{j=1}^{n} (d_j-1)!}
  \quad \text{(since } (d_i-1)! = 0! = 1\text{)} \\
 &= \frac{(n-3)!}{\prod_{j=1}^{n} (d_j-1)!} \sum_{k \ne i} (d_k-1).
\end{aligned}$$

Now, since d_i = 1, $\sum_{k \ne i} (d_k-1) = \sum_{k=1}^{n} (d_k-1) = (2n-2) - n = n-2$. Inserting this
into the previous formula, we see that

$$|A| = \frac{(n-2)!}{\prod_{j=1}^{n} (d_j-1)!},$$

which completes the induction proof. □
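
The formula of Theorem 3.74 can be compared against an exhaustive count for small n. The Python sketch below is only an illustration; the function names formula_count and brute_force_count are ours, vertices are relabeled 0, ..., n - 1 for convenience, and we assume n ≥ 2 as in the theorem. The brute-force routine keeps every (n - 1)-edge graph with the prescribed degrees that is also connected, which is a tree by Theorem 3.70(b).

```python
from itertools import combinations
from math import factorial

def formula_count(degrees):
    """Number of trees predicted by Theorem 3.74 for the degrees (d_1, ..., d_n)."""
    n = len(degrees)
    if sum(degrees) != 2 * n - 2 or min(degrees) < 1:
        return 0  # wrong degree sum, or a multinomial with a negative entry
    denominator = 1
    for d in degrees:
        denominator *= factorial(d - 1)
    return factorial(n - 2) // denominator

def brute_force_count(degrees):
    """Count trees on vertices 0..n-1 with the given degrees by exhaustive search."""
    n = len(degrees)
    all_edges = list(combinations(range(n), 2))
    count = 0
    for edge_set in combinations(all_edges, n - 1):
        adj = {v: [] for v in range(n)}
        for u, v in edge_set:
            adj[u].append(v)
            adj[v].append(u)
        if [len(adj[v]) for v in range(n)] != list(degrees):
            continue
        seen, stack = {0}, [0]  # depth-first search to test connectivity
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(seen) == n:
            count += 1
    return count
```

For the degree sequence (3, 2, 1, 1, 1) both functions return 3!/(2! 1! 0! 0! 0!) = 3.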


3.75. Corollary: Second Proof of the Tree Rule. Let us sum the previous formula
over all possible degree sequences (d_1, ..., d_n). Making the change of variables c_i = d_i - 1
and invoking the Multinomial Theorem 2.11, we see that the total number of trees on this
vertex set is

$$\sum_{\substack{d_1+\cdots+d_n=2n-2 \\ d_i>0}} \binom{n-2}{d_1-1,\, d_2-1,\, \ldots,\, d_n-1}
 = \sum_{\substack{c_1+\cdots+c_n=n-2 \\ c_i \ge 0}} \binom{n-2}{c_1,\, c_2,\, \ldots,\, c_n} 1^{c_1} 1^{c_2} \cdots 1^{c_n}
 = (1+1+\cdots+1)^{n-2} = n^{n-2}.$$
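
The summation in this corollary can be carried out numerically for small n. The Python sketch below is only an illustration (the helper names multinomial and sum_over_degree_sequences are ours): it adds the multinomial coefficients over all (c_1, ..., c_n) with nonnegative entries summing to n - 2.

```python
from itertools import product
from math import factorial

def multinomial(total, parts):
    """Multinomial coefficient total! / (c_1! c_2! ... c_n!), assuming the parts sum to total."""
    result = factorial(total)
    for c in parts:
        result //= factorial(c)
    return result

def sum_over_degree_sequences(n):
    """Sum the counts from Theorem 3.74 over all valid (c_1, ..., c_n) = (d_1 - 1, ..., d_n - 1)."""
    total = 0
    for c in product(range(n - 1), repeat=n):  # each c_i lies in 0..n-2
        if sum(c) == n - 2:
            total += multinomial(n - 2, c)
    return total
```

For n = 4 the sum is 16, in agreement with Example 3.72.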


3.12 Pruning Maps 


We now develop a bijective proof of Theorem 3.74. Suppose n ≥ 2 and d_1, ..., d_n are
positive integers that sum to 2n - 2. Let V = {v_1, ..., v_n} be a vertex set consisting of
n positive integers v_1 < v_2 < ··· < v_n. Let A be the set of trees T with vertex set V
such that deg(v_i) = d_i for all i. Let B be the set of words R(v_1^{d_1-1} v_2^{d_2-1} ··· v_n^{d_n-1}) (see
Definition 1.32). Each word w ∈ B has length n - 2 and has d_i - 1 copies of v_i. To prove
Theorem 3.74, it suffices to define a bijection f : A → B.

Given a tree T ∈ A, we compute f(T) by repeatedly pruning off the largest leaf of T,
recording for each leaf the vertex adjacent to it in T. More formally, for i ranging from
1 to n - 1, let x be the largest leaf of T; define w_i to be the unique neighbor of x in T;
then modify T by pruning the leaf x. This process produces a word w_1 ··· w_{n-1}; we define
f(T) = w_1 ··· w_{n-2} (note that w_{n-1} is discarded).


3.76. Example. Let T be the tree shown in Figure 3.15. To compute f(T), we prune leaves
from T in the following order: 8, 5, 4, 9, 6, 3, 2, 7. Recording the neighbors of these leaves, we
see that w = f(T) = 9996727. Observe that the algorithm computes w_{n-1} = 1, but this
letter is not part of the output word w. Also observe that w ∈ R(1^0 2^1 3^0 4^0 5^0 6^1 7^2 8^0 9^3) =
R(1^{d_1-1} ··· 9^{d_9-1}).

FIGURE 3.15
A tree with (d_1, ..., d_9) = (1, 2, 1, 1, 1, 2, 3, 1, 4).
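
The pruning map f translates directly into code. The Python sketch below is only an illustration: the function name pruning_word is ours, and a tree is assumed to be given as a list of edges on at least two integer vertices. The edge list used in the usage note is read off from the pruning order and neighbors recorded in Example 3.76.

```python
def pruning_word(edges):
    """Compute f(T) by repeatedly pruning the largest leaf of the tree T."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(adj)
    word = []
    for _ in range(n - 1):
        leaf = max(v for v in adj if len(adj[v]) == 1)  # largest current leaf
        (neighbor,) = adj[leaf]                          # its unique neighbor
        word.append(neighbor)
        adj[neighbor].discard(leaf)                      # prune the leaf
        del adj[leaf]
    return word[:-1]  # discard the final letter w_{n-1}
```

With edges = [(8, 9), (5, 9), (4, 9), (9, 6), (6, 7), (3, 2), (2, 7), (7, 1)], the call pruning_word(edges) returns [9, 9, 9, 6, 7, 2, 7], matching the word 9996727 in the example.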


The observations in the last example hold in general. Given any T ∈ A, repeatedly
pruning leaves from T will produce a sequence of smaller graphs, which are all trees by the
Pruning Lemma 3.68. By Theorem 3.66, each such tree (except the last tree) has at least
two leaves, so vertex v_1 will never be chosen for pruning. In particular, v_1 is always the last
vertex left, so that w_{n-1} is always v_1. Furthermore, if v_j is any vertex different from v_1,
then the number of occurrences of v_j in w_1 w_2 ··· w_{n-1} is exactly d_j - 1. For, every time
a pruning operation removes an edge touching v_j, we set w_i = v_j for some i, except when
we are removing the last remaining edge touching v_j (which occurs when v_j has become
the largest leaf and is being pruned). The same reasoning shows that v_1 (which never gets
pruned) appears d_1 times in w_1 ··· w_{n-1}. Since w_{n-1} = v_1, every vertex v_j occurs d_j - 1
times in the output word w_1 ··· w_{n-2}.

To see that f is a bijection, we use induction on the number of vertices. The result holds
when n = 2, since in this case, A consists of a single tree with two nodes, and B consists
of a single word (the empty word). Now suppose n > 2 and the maps f (defined for trees
with fewer than n vertices) are already known to be bijections. Given w = w_1 ··· w_{n-2} ∈ B,
we will show there exists exactly one T ∈ A with f(T) = w. If such T exists, the leaves of
T are precisely the vertices in V(T) that do not appear in w. Thus, the first leaf that gets
pruned when computing f(T) must be the largest element z of V(T) - {w_1, ..., w_{n-2}}. By
induction hypothesis, there exists exactly one tree T' on the vertex set V(T) - {z} (with
the appropriate vertex degrees) such that f(T') = w_2 ··· w_{n-2}. This given, we will have
f(T) = w iff T is the tree obtained from T' by attaching a new leaf z as a neighbor of
vertex w_1. One readily confirms that this graph is in A (i.e., the graph is a tree with the
correct vertex degrees). This completes the induction proof. The proof also yields a recursive
algorithm for computing f^{-1}(w). The key point is to use the letters not seen in w (and its
suffixes) to determine the identity of the leaf that was pruned at each stage.


3.77. Example. Given w = 6799297 and V = {1, 2, ..., 9}, let us compute the tree f^{-1}(w)
with vertex set V. The leaves of this tree must be {1, 3, 4, 5, 8}, which are the elements of V
not seen in w. Leaf 8 was pruned first and was adjacent to vertex 6. So now we must compute
the tree f^{-1}(799297) with vertex set V - {8}. Here, leaf 6 was pruned first and was adjacent
to vertex 7. Continuing in this way, we deduce that the leaves were pruned in the order
8, 6, 5, 4, 3, 2, 9, 7; and the neighbors of these leaves (reading from w) were 6, 7, 9, 9, 2, 9, 7, 1.
Thus, f^{-1}(w) is the tree shown in Figure 3.16.

FIGURE 3.16
The tree computed by applying f^{-1} to w = 6799297.
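
The recursive description of f^{-1} can be unrolled into a loop. The Python sketch below is only an illustration: the function name tree_from_word is ours, the word is assumed to be a list of n - 2 letters drawn from the n integer vertices, and the tree is returned as a list of pairs (pruned leaf, neighbor) in pruning order.

```python
def tree_from_word(word, vertices):
    """Compute the edges of the tree with pruning word `word` on the given vertex set."""
    remaining = set(vertices)
    word = list(word)
    edges = []
    for i, neighbor in enumerate(word):
        unseen = remaining - set(word[i:])  # leaves of the current tree
        leaf = max(unseen)                  # the largest leaf was pruned first
        edges.append((leaf, neighbor))
        remaining.remove(leaf)
    smallest, other = sorted(remaining)     # two vertices are left at the end
    edges.append((other, smallest))         # last pruning step: w_{n-1} = v_1
    return edges
```

For w = 6799297 and V = {1, ..., 9}, the returned pairs are (8, 6), (6, 7), (5, 9), (4, 9), (3, 2), (2, 9), (9, 7), (7, 1), which agrees with the pruning order and neighbors found in Example 3.77.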


3.13 Bipartite Graphs


Suppose m people are applying for n jobs, and we want to assign each person to a job that 
he or she is qualified to perform. We can model this situation with a graph in which there 
are two kinds of vertices: a set A of m vertices representing the people, and a disjoint set B 
of n vertices representing the jobs. For all a ∈ A and b ∈ B, {a, b} is an edge in the graph iff
person a is qualified to perform job b. All edges in the graph join a vertex in A to a vertex
in B. Graphs of this special type arise frequently; they are called bipartite graphs. 


3.78. Definition: Bipartite Graphs. A graph G is bipartite iff there exist two sets A
and B such that A ∩ B = ∅, A ∪ B = V(G), and every edge of G has one endpoint in A and
one endpoint in B. In this case, the sets A and B are called partite sets for G.


3.79. Example. The graph in Figure 3.3 is a bipartite graph with partite sets A = {1, 2,3} 
and B = {4,5,6}. This graph models the scenario where three people (labeled 1, 2, and 
3) are all qualified to perform three jobs (labeled 4, 5, and 6). The graph G in Figure 3.1 
is not bipartite due to the loop edge f at vertex 3. However, deleting this edge gives a 
bipartite graph with partite sets A = {1,3,5} and B = {2,4}. The tree in Figure 3.12 is 
bipartite with partite sets A = {1,7,8,9} and B = {2,3,4,5,6,10,11}. These sets are also 
partite sets for the forest in Figure 3.11, but here we could use other partite sets such as 
A’ = {1,5,6,9,10} and B’ = {2,3,4,7,8,11}. Thus, the partite sets of a bipartite graph 
are not unique in general. 


The next theorem gives a simple criterion for deciding whether a graph is bipartite. 


3.80. Theorem: Bipartite Graphs. A graph G is bipartite iff G has no cycle of odd 
length iff G has no closed walk of odd length. 


Proof. First assume G is a bipartite graph with partite sets A and B. Let
(v_0, e_1, v_1, e_2, v_2, ..., e_s, v_s) be any cycle in G; we must show that s is even. By switching
A and B if needed, we may assume v_0 ∈ A. Since e_1 is an edge from v_0 to v_1, we must
have v_1 ∈ B. Since e_2 is an edge from v_1 to v_2, we must have v_2 ∈ A. It follows by induction
that for 0 ≤ i ≤ s, v_i ∈ A iff i is even, and v_i ∈ B iff i is odd. Since v_s = v_0 is in A, we see
that s must be even, as needed.

Second, we prove that if G has no cycle of odd length, then G has no closed walk
of odd length. Using proof by contrapositive, assume there exists a closed walk of odd
length in G; we prove there must exist a cycle of odd length in G. Choose a closed walk
(v_0, e_1, v_1, ..., e_s, v_s) of odd length with s as small as possible; we claim this walk must be
a cycle. Otherwise (after cyclically shifting the walk, if necessary), there would be an index i
with 0 < i < s and v_0 = v_i = v_s. If i is odd,
then (v_0, e_1, v_1, ..., e_i, v_i) is a closed walk of odd length i < s, contradicting minimality
of s. If i is even, then (v_i, e_{i+1}, v_{i+1}, ..., e_s, v_s) is a closed walk of odd length s - i < s,
contradicting minimality of s. Thus the original walk was already a cycle.

Third, we prove that if G has no closed walk of odd length, then G is bipartite. We
begin with the special case where G is connected. Fix a vertex v_0 ∈ V(G). Let A be the
set of v ∈ V(G) such that there is a walk of even length in G from v_0 to v; let B be the
set of w ∈ V(G) such that there is a walk of odd length in G from v_0 to w. Since G is
connected, every vertex of G is reachable from v_0, so V(G) = A ∪ B. We claim A ∩ B = ∅.
For if v ∈ A ∩ B, there are walks W_1 and W_2 from v_0 to v where W_1 has even length and
W_2 has odd length. The concatenation of W_1 with the reversal of W_2 would give a closed
walk of odd length in G, contrary to our hypothesis on G. Next, we claim every edge of G
must join a vertex in A to a vertex in B. For if {u, v} is an edge where u, v ∈ A, then we
get a closed walk of odd length by taking an even-length walk from v_0 to u, followed by
the edge {u, v}, followed by an even-length walk from v to v_0. Similarly, we cannot have an
edge {u, v} with u, v ∈ B. We have now proved G is bipartite with partite sets A and B.

To treat the general case, let G have components C_1, ..., C_m. Since G has no closed
walks of odd length, the same is true of each component. Thus, the argument in the last
paragraph applies to each component, providing disjoint sets A_i, B_i such that C_i = A_i ∪ B_i
and every edge with endpoints in C_i goes from a vertex in A_i to a vertex in B_i. It follows
that A = A_1 ∪ ··· ∪ A_m and B = B_1 ∪ ··· ∪ B_m are partite sets for G, so G is bipartite. □


3.81. Example. The simple graph G_2 in Figure 3.1 is not bipartite because (1, 2, 6, 1) is a
cycle in G_2 of length 3.
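
The third part of the proof of Theorem 3.80 is constructive: within each component, vertices are sorted by the parity of a walk from a base vertex. The Python sketch below follows that idea using breadth-first search. It is only an illustration; the function name partite_sets is ours, and the graph is assumed to be given as a list of vertices and a list of edges.

```python
from collections import deque

def partite_sets(vertices, edges):
    """Return partite sets (A, B) if the graph is bipartite, and None otherwise."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = {}  # side[v] is 0 for A and 1 for B
    for start in vertices:  # handle each component separately
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]  # parity flips along every edge
                    queue.append(v)
                elif side[v] == side[u]:
                    return None  # an edge inside one side gives an odd closed walk
    A = {v for v in vertices if side[v] == 0}
    B = {v for v in vertices if side[v] == 1}
    return A, B
```

On the 4-cycle with edges {1,2}, {2,3}, {3,4}, {4,1} the function returns ({1, 3}, {2, 4}); on the 3-cycle through 1, 2, 6 from Example 3.81 it returns None.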




3.14 Matchings and Vertex Covers 


Consider again a job assignment graph where there is a set A of vertices representing 
people, a disjoint set B of vertices representing jobs, and an edge {a,b} whenever person 
a is qualified for job b. We can model an assignment of people to jobs by a subset M of 
the edge set in which no two edges in M have a common endpoint. This condition means 
that each person can perform at most one of the jobs, and each job can be filled by at most 
one person. Ideally, we would like to choose M so that every person gets a job. If that is 
impossible, we would like to choose M as large as possible. This discussion motivates the 
following definition, which applies to any (not necessarily bipartite) graph. 


3.82. Definition: Matchings. A matching of a graph G is a set M of edges of G such 
that no two edges in M share an endpoint. Let m(G) be the maximum number of edges in 
any matching of G. 


3.83. Example. The graph G in Figure 3.10 has a matching 


M = {{1, 2}, {3, 10}, {4, 8}, {9, 12}, {6, 13}, {5, 7}} 


of size 6. There can be no larger matching, since the graph has only 13 vertices. Thus, 
m(G) = 6. 


3.84. Example. Consider the path graph P_n with vertex set {1, 2, ..., n} and edge set
{{i, i+1} : 1 ≤ i < n}. Figure 3.17 shows examples of path graphs. We can write n = 2k or
n = 2k + 1 for some integer k ≥ 0. In either case, the set M = {{1, 2}, {3, 4}, {5, 6}, ..., {2k-1, 2k}}
is a maximum matching of P_n, so m(P_n) = k = ⌊n/2⌋. Similarly, consider the cycle
graph C_n obtained from P_n by adding the edge {1, n}. The matching M is also a maximum
matching of C_n, so m(C_n) = k = ⌊n/2⌋.

FIGURE 3.17
Path graphs and cycle graphs.


The next definition introduces a new optimization problem that is closely related to the 
problem of finding a maximum matching in a graph. 


3.85. Definition: Vertex Covers. A vertex cover of a graph G is a subset C of the vertex 
set of G such that every edge of G has at least one of its endpoints in C. Let vc(G) be the 
minimum number of vertices in any vertex cover of G. 


3.86. Example. The graph G in Figure 3.10 has vertex cover C = {1, 4, 7, 8, 9, 10, 13}. For
n = 2k or n = 2k + 1, the path graph P_n has vertex cover D = {2, 4, 6, ..., 2k} of size
k = ⌊n/2⌋. When n = 2k is even, D is also a vertex cover of the cycle graph C_n. When
n = 2k + 1 is odd, D is not a vertex cover of C_n, but D ∪ {1} is.


In the previous example, some additional argument is needed to determine whether 
we have found vertex covers of the minimum possible size. Similarly, in most cases, it is 
not immediately evident that a given matching of a graph has the maximum possible size. 
However, the next lemma shows that if we are able to build a matching M of G and a vertex
cover C of G having the same size, then M must be a maximum matching and C must be
a minimum vertex cover, so that m(G) = |M| = |C| = vc(G).


3.87. Lemma: Matchings and Vertex Covers. Let G be a graph with matching M
and vertex cover C. (a) |M| ≤ |C|, and hence m(G) ≤ vc(G). (b) If |M| = |C|, then
m(G) = |M| = |C| = vc(G), and every vertex in C is an endpoint of some edge in M.


Proof. (a) We define a function f : M → C as follows. Given an edge e ∈ M, let v, w be the
endpoints of this edge. Since C is a vertex cover, at least one of v or w must belong to C. Let
f(e) = v if v ∈ C, and f(e) = w otherwise. We claim f is one-to-one. For suppose e, e' ∈ M
satisfy f(e) = f(e'). Then e and e' share a common endpoint f(e) = f(e'). Because M is
a matching, this forces e = e'. So f is one-to-one, and hence |M| ≤ |C|. Taking M to be
a maximum matching and C to be a minimum vertex cover in this result, we deduce that
m(G) ≤ vc(G).

(b) Suppose |M| = |C|. For any matching M' of G, part (a) shows that |M'| ≤ |C| = |M|,
so M is a maximum matching of G. For any vertex cover C' of G, part (a) shows that
|C'| ≥ |M| = |C|, so C is a minimum vertex cover of G. Thus, m(G) = |M| = |C| = vc(G)
in this situation. Moreover, because |M| = |C|, the injective map f : M → C constructed
in part (a) must be bijective by Theorem 1.77. Surjectivity of f means that for any vertex
v ∈ C, there is an edge e ∈ M with f(e) = v. By definition of f, this means that each
vertex in the minimum vertex cover C is an endpoint of some edge in the maximum matching
M. □


3.88. Example. For the path graph P_n, we found a matching M and a vertex cover C of
the same size. By the lemma, M is a maximum matching, C is a minimum vertex cover,
and m(P_n) = vc(P_n) = ⌊n/2⌋. Similarly, for n even, we have m(C_n) = vc(C_n) = n/2. But,
it can be checked that the odd cycle C_5 has m(C_5) = 2 and vc(C_5) = 3. So, for general
graphs G, m(G) < vc(G) can occur. We prove in the next section that for a bipartite graph
G, m(G) and vc(G) are always equal.
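
The values of m(G) and vc(G) quoted in this example can be confirmed by exhaustive search on small graphs. The Python sketch below is only an illustration; all four function names are ours, and the search is exponential in the size of the graph, so it is meant for tiny cases only.

```python
from itertools import combinations

def max_matching_size(edges):
    """m(G): the largest number of edges with no shared endpoints."""
    for size in range(len(edges), -1, -1):
        for subset in combinations(edges, size):
            endpoints = [v for e in subset for v in e]
            if len(endpoints) == len(set(endpoints)):  # all endpoints distinct
                return size

def min_vertex_cover_size(vertices, edges):
    """vc(G): the smallest number of vertices meeting every edge."""
    for size in range(len(vertices) + 1):
        for cover in combinations(vertices, size):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return size

def path_graph(n):
    """Vertices and edges of the path graph P_n."""
    return list(range(1, n + 1)), [(i, i + 1) for i in range(1, n)]

def cycle_graph(n):
    """Vertices and edges of the cycle graph C_n (intended for n >= 3)."""
    vertices, edges = path_graph(n)
    return vertices, edges + [(1, n)]
```

For instance, max_matching_size(cycle_graph(5)[1]) is 2, while min_vertex_cover_size(*cycle_graph(5)) is 3.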




3.15 Two Matching Theorems

3.89. The König–Egerváry Theorem. For all bipartite graphs G, m(G) = vc(G).


Proof. Let G have vertex set V and edge set E. We proceed by strong induction on |V| + |E|.
Fix a bipartite graph G with N = |V| + |E|, and assume that for any bipartite graph G'
with |V(G')| + |E(G')| < N, m(G') = vc(G'). We will prove m(G) = vc(G) by considering
various cases. Case 1: G is isomorphic to a path graph P_n. The examples in the last section
show that m(G) = ⌊n/2⌋ = vc(G) in this case. Case 2: G is isomorphic to a cycle graph
C_n. Since bipartite graphs contain no odd cycles, n must be even, and we saw earlier that
m(G) = n/2 = vc(G) in this case.

Case 3: G is not simple. Now G has no loop edges (being bipartite), so there must
exist multiple edges between some pair of vertices in G. Let G' be the bipartite graph
obtained from G by deleting one of these multiple edges. By the induction hypothesis,
m(G') = vc(G'). On the other hand, we see from the definitions that m(G) = m(G') and
vc(G) = vc(G'), so m(G) = vc(G).

Case 4: G is not connected. Let K_1, ..., K_m be the components of G. Form graphs
G_1, ..., G_m by letting G_i have vertex set K_i and edge set consisting of all edges in E(G)
with both endpoints in K_i. Since G is not connected, m > 1, so the induction hypothesis
applies to the smaller graphs G_1, ..., G_m (which are still bipartite). Let M_i be a maximum
matching of G_i, and let C_i be a minimum vertex cover of G_i for each i; then |M_i| = m(G_i) =
vc(G_i) = |C_i| for all i. One immediately verifies that M = M_1 ∪ ··· ∪ M_m is a matching of G of
size |M_1| + ··· + |M_m|, and C = C_1 ∪ ··· ∪ C_m is a vertex cover of G of size |C_1| + ··· + |C_m|.
Since |M| = |C|,
the Lemma on Matchings and Vertex Covers assures us that m(G) = |M| = |C| = vc(G),
as needed.

Case 5: G is simple and connected, but G is not isomorphic to a path P_n or a cycle C_n.
It readily follows that there must exist a vertex u in G with deg(u) ≥ 3. Let v be a fixed
vertex of G adjacent to u. Case 5a: Every maximum matching of G contains an edge with
endpoint v. Let G' be the graph obtained from G by deleting vertex v and all edges having
v as an endpoint. G' is still bipartite, and the induction hypothesis applies to show that
m(G') = vc(G'). Let C' be a fixed minimum vertex cover of G'. Then C = C' ∪ {v} is a vertex
cover of G, so vc(G) ≤ |C| = |C'| + 1 = vc(G') + 1 = m(G') + 1. On the other hand, any
maximum matching M' of G' is also a matching of G, but it cannot be a maximum matching
of G by the assumption of Case 5a. So m(G') < m(G), hence vc(G) ≤ m(G') + 1 ≤ m(G).
We already know m(G) ≤ vc(G), so m(G) = vc(G) follows.

Case 5b: There exists a maximum matching M of G such that no edge of M has endpoint
v. Recall deg(u) ≥ 3 and G is simple. Of the edges touching u, one leads to v, and at most
one other edge appears in the matching M. Thus there must exist an edge f ∉ M such that
u is an endpoint of f but v is not. Let G' be the graph obtained from G by deleting the edge
f. G' is still bipartite, and the induction hypothesis applies to show that m(G') = vc(G').
Because M did not use the edge f, M is a maximum matching of G' as well as G, so
m(G) = m(G'). Let C' be a minimum vertex cover of G'. It suffices to show C' is also a
vertex cover of G, for then G will have a matching and a vertex cover of the same size. Every
edge of G except possibly f has an endpoint in C'. Also, since |M| = |C'|, the Lemma on
Matchings and Vertex Covers (applied to the graph G') tells us that every vertex in C' is
the endpoint of some edge in M. By the assumption in Case 5b, this forces v ∉ C'. But the
edge from u to v is in G', so u ∈ C' by definition of a vertex cover. Since u is an endpoint
of the edge f, we see that C' is a vertex cover of G, as needed. □
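
The inductive proof above does not directly give an efficient procedure. As a complement, the Python sketch below uses a different, standard technique: augmenting paths to build a maximum matching, followed by an alternating-path search to extract a vertex cover of the same size. It is only an illustration. The function names are ours, the graph is assumed to be simple and bipartite with partite set A, and adj[a] is assumed to list the neighbors in B of each a in A.

```python
def max_bipartite_matching(A, adj):
    """Return a maximum matching as a dict sending matched b in B to its partner a in A."""
    match_b = {}

    def try_augment(a, visited):
        # Search for an augmenting path starting at the vertex a.
        for b in adj[a]:
            if b in visited:
                continue
            visited.add(b)
            if b not in match_b or try_augment(match_b[b], visited):
                match_b[b] = a
                return True
        return False

    for a in A:
        try_augment(a, set())
    return match_b

def vertex_cover_from_matching(A, adj, match_b):
    """Return a vertex cover with as many vertices as the maximum matching has edges."""
    match_a = {a: b for b, a in match_b.items()}
    reach_a = {a for a in A if a not in match_a}  # unmatched vertices of A
    reach_b = set()
    stack = list(reach_a)
    while stack:  # follow alternating paths: any edge A -> B, matching edge B -> A
        a = stack.pop()
        for b in adj[a]:
            if b not in reach_b:
                reach_b.add(b)
                partner = match_b.get(b)
                if partner is not None and partner not in reach_a:
                    reach_a.add(partner)
                    stack.append(partner)
    return (set(A) - reach_a) | reach_b
```

For the job graph with adj = {"p1": ["j1", "j2"], "p2": ["j1"], "p3": ["j1"]}, the matching has two edges and the cover returned is {"p1", "j1"}, so m(G) = vc(G) = 2 by Lemma 3.87.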


The next theorem will use the following notation. Given a set S of vertices in a graph
G, a vertex v is a neighbor of S iff there exists w ∈ S and an edge e in G with endpoints v
and w. Let N(S) be the set of all neighbors of S. Given a matching M of G and a set A of
vertices of G, we say the matching saturates A iff every vertex in A is the endpoint of some
edge in M.


3.90. Hall's Matching Theorem. Let G be a bipartite graph with partite sets A and B.
There exists a matching of G saturating A iff for all S ⊆ A, |S| ≤ |N(S)|.


Proof. First assume there is a matching M of G saturating A. Fix S ⊆ A. Define a function
f : S → N(S) as follows. For each v ∈ S, there exists an edge e in M having v as one
endpoint (since M saturates A), and this edge is unique (since M is a matching). Let f(v)
be the other endpoint of edge e, which lies in N(S). Now f is one-to-one: if f(v) = f(v')
for some v, v' ∈ S, then the edge from v to f(v) and the edge from v' to f(v') = f(v) both
appear in M, forcing v = v' because M is a matching. So |S| ≤ |N(S)|.

Conversely, assume |S| ≤ |N(S)| for all S ⊆ A. Let C be an arbitrary vertex cover
of G. Let A_1 = A ∩ C, A_2 = A - C, and B_1 = B ∩ C. Every edge with an endpoint in
A_2 must have its other endpoint in B_1, since C is a vertex cover of G. This means that
N(A_2) ⊆ B_1. Taking S = A_2 in the assumption, we see that |A_2| ≤ |N(A_2)| ≤ |B_1|. Then
|C| = |A_1| + |B_1| ≥ |A_1| + |A_2| = |A|. On the other hand, the set A is a vertex cover of G since
G is bipartite with partite sets A and B. Since |A| ≤ |C| holds for every vertex cover C, we
see that A is a minimum vertex cover of G. By the previous theorem, m(G) = vc(G) = |A|.
So there is a matching M of G consisting of |A| edges. Each of these edges has one endpoint
in A, and these endpoints are distinct because M is a matching. Thus, M is a matching
that saturates A. □
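
Hall's condition can be tested directly on small examples by running over all subsets S of A. The Python sketch below is only an illustration; the function names are ours, and adj[a] is assumed to be the set of neighbors in B of each a in A. The second function searches naively for a matching saturating A, so the two answers can be compared.

```python
from itertools import combinations, permutations

def hall_condition_holds(A, adj):
    """Check |S| <= |N(S)| for every subset S of A."""
    for size in range(len(A) + 1):
        for S in combinations(A, size):
            neighbors = set().union(*(adj[a] for a in S))  # N(S)
            if len(S) > len(neighbors):
                return False
    return True

def has_saturating_matching(A, B, adj):
    """Search all injective assignments of vertices of B to the vertices of A."""
    for choice in permutations(B, len(A)):
        if all(b in adj[a] for a, b in zip(A, choice)):
            return True
    return False
```

For adj = {1: {"x"}, 2: {"x"}, 3: {"y", "z"}} the condition fails at S = {1, 2}, and no matching saturates A = [1, 2, 3].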




3.16 Graph Coloring 


This section introduces the graph coloring problem and some of its applications. 


3.91. Definition: Colorings. Let G = (V, E) be a simple graph, and let C be a finite set.
A coloring of G using colors in C is a function f : V → C. A coloring f of G is a proper
coloring iff for every edge {u, v} ∈ E, f(u) ≠ f(v).


Intuitively, we are coloring each vertex of G using one of the available colors in the set 
C. For each v ∈ V, f(v) is the color assigned to vertex v. A coloring is proper iff no two
adjacent vertices in G are assigned the same color. 


3.92. Definition: Chromatic Functions and Chromatic Numbers. Let G be a simple
graph. For each positive integer x, let χ_G(x) be the number of proper colorings of G using
colors in {1, 2, ..., x}. The function χ_G : Z_{>0} → Z_{≥0} is called the chromatic function of G.
The minimal x such that χ_G(x) > 0 is called the chromatic number of G.


The chromatic number is the least number of colors required to obtain a proper coloring 
of G. The function χ_G is often called the chromatic polynomial of G because of Corollary 3.99
below. 


3.93. Example. Suppose G is a simple graph with n vertices and no edges. By the Product
Rule, χ_G(x) = x^n since we can assign any of the x colors to each vertex. The chromatic
number for this graph is 1.


3.94. Example. At the other extreme, suppose G is a simple graph with n vertices such
that there is an edge joining every pair of distinct vertices. Color the vertices one at a time.
The first vertex can be colored in x ways. The second vertex must have a color different
from the first, so there are x - 1 choices. In general, the ith vertex must have a color distinct
from all of its predecessors, so there are x - (i - 1) choices for the color of this vertex. The
Product Rule gives χ_G(x) = x(x - 1)(x - 2) ··· (x - n + 1) = (x)↓_n. The chromatic number
for this graph is n. Recall from §2.15 that

$$(x){\downarrow}_n = \sum_{k=1}^{n} s(n,k)\, x^k,$$

so that the function χ_G in this example is a polynomial whose coefficients are the signed
Stirling numbers of the first kind.


3.95. Example: Cycles. Consider the simple graph

G = ({1, 2, 3, 4}, {{1, 2}, {2, 3}, {3, 4}, {4, 1}}).

G consists of four vertices joined in a 4-cycle. We might attempt to compute χ_G(x) via
the Product Rule as follows. Color vertex 1 in x ways. Then color vertex 2 in x - 1 ways,
and color vertex 3 in x - 1 ways. We run into trouble at vertex 4, because we do not know
whether vertices 1 and 3 were assigned the same color. This example shows that we cannot
always compute χ_G by the Product Rule alone. In this instance, we can classify proper
colorings based on whether vertices 1 and 3 receive the same or different colors. If they
receive the same color, the number of proper colorings is x(x - 1)(x - 1) (color vertices 1
and 3 together, then color vertex 2 a different color, then color vertex 4 a different color
from 1 and 3). If vertices 1 and 3 receive different colors, the number of proper colorings is
x(x - 1)(x - 2)(x - 2) (color vertex 1, then vertex 3, then vertex 2, then vertex 4). Hence

$$\chi_G(x) = x(x-1)(x-1) + x(x-1)(x-2)(x-2) = x^4 - 4x^3 + 6x^2 - 3x.$$

The chromatic number for this graph is 2.

More generally, consider the cycle graph C_n consisting of n vertices joined in a cycle. It
is routine to establish that the chromatic number of C_n is 1 for n = 1, is 2 for all even n,
and is 3 for all odd n > 1. On the other hand, it is not immediately evident how to compute
the chromatic function for C_n when n > 4. We will deduce a recursion for these functions
shortly as a special case of a general recursion for computing chromatic functions.
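
For a small graph, χ_G(x) can be evaluated at a specific x straight from Definition 3.92 by listing all colorings and keeping the proper ones. The Python sketch below is only an illustration (the function name count_proper_colorings is ours); it examines x^n colorings, so it is practical only for small n and x.

```python
from itertools import product

def count_proper_colorings(vertices, edges, x):
    """Evaluate the chromatic function of a simple graph at the positive integer x."""
    vertices = list(vertices)
    position = {v: i for i, v in enumerate(vertices)}
    count = 0
    for colors in product(range(1, x + 1), repeat=len(vertices)):
        # A coloring is proper iff the endpoints of every edge get different colors.
        if all(colors[position[u]] != colors[position[v]] for u, v in edges):
            count += 1
    return count
```

For the 4-cycle of Example 3.95, count_proper_colorings([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)], 3) returns 18, which equals 3^4 - 4·3^3 + 6·3^2 - 3·3.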


Here is an application that can be analyzed using graph colorings and chromatic numbers.
Suppose we are trying to schedule meetings for a number of committees. If two committees
share a common member, they cannot meet at the same time. Consider the graph
G whose vertices represent the various committees, and where there is an edge between two
vertices iff the corresponding committees share a common member. Suppose there are x
available time slots in which meetings may be scheduled. A coloring of G with x colors represents
a particular scheduling of committee meetings to time slots. The coloring is proper
iff the schedule creates no time conflicts for any committee member. The chromatic number
is the least number of time slots needed to avoid all conflicts, while χ_G(x) is the number of
different conflict-free schedules using x (distinguishable) time slots.


3.96. Example. Six committees have members as specified in Table 3.1. 


Committee | Members
A         | Kemp, Oakley, Saunders
B         | Gray, Saunders, Russell
C         | Byrd, Oakley, Quinn
D         | Byrd, Jenkins, Kemp
E         | Adams, Jenkins, Wilson
F         | Byrd, Gray, Russell

TABLE 3.1
Committee assignments in Example 3.96.


Figure 3.18 displays the graph G associated to this set of committees. To compute χ_G(x),
consider cases based on whether vertices A and F receive the same color. If A and F are
colored the same, the number of proper colorings is x(x - 1)(x - 2)(x - 1)(x - 1) [color
A and F, then C, D, B, and E]. If A and F receive different colors, the number of proper
colorings is x(x - 1)(x - 2)(x - 3)(x - 2)(x - 1) [color A, F, C, D, B, E]. Thus,

$$\chi_G(x) = x(x-1)^2(x-2)\bigl(x - 1 + (x-2)(x-3)\bigr) = x^6 - 8x^5 + 26x^4 - 42x^3 + 33x^2 - 10x.$$

The chromatic number of G is 3.


FIGURE 3.18 
Conflict graph for six committees. 
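
The conflict graph and its chromatic number can be computed from the membership lists. The Python sketch below is only an illustration. It assumes the committees are labeled A through F in the order in which they are listed in Table 3.1, and the names committees, edges, count_proper_colorings, and chromatic_number are ours.

```python
from itertools import combinations, product

committees = {
    "A": {"Kemp", "Oakley", "Saunders"},
    "B": {"Gray", "Saunders", "Russell"},
    "C": {"Byrd", "Oakley", "Quinn"},
    "D": {"Byrd", "Jenkins", "Kemp"},
    "E": {"Adams", "Jenkins", "Wilson"},
    "F": {"Byrd", "Gray", "Russell"},
}

# Two committees are joined by an edge iff they share a member.
edges = [(a, b) for a, b in combinations(sorted(committees), 2)
         if committees[a] & committees[b]]

def count_proper_colorings(vertices, edges, x):
    """Count the conflict-free schedules that use x distinguishable time slots."""
    vertices = list(vertices)
    position = {v: i for i, v in enumerate(vertices)}
    return sum(
        all(colors[position[u]] != colors[position[v]] for u, v in edges)
        for colors in product(range(x), repeat=len(vertices))
    )

def chromatic_number(vertices, edges):
    """Least x for which some proper coloring with x colors exists."""
    x = 1
    while count_proper_colorings(vertices, edges, x) == 0:
        x += 1
    return x
```

Under the stated labeling, the graph has eight edges, chromatic_number(committees, edges) is 3, and there are 24 conflict-free schedules with three time slots, in agreement with the polynomial computed in the example.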


We are about to present a general recursion that can be used to compute the chromatic 
function of a simple graph. The recursion makes use of the following construction. 


3.97. Definition: Collapsing an Edge. Let G = (V, E) be a simple graph, and let
e_0 = {v_0, w_0} be a fixed edge of G. Let z_0 be a new vertex. We define a simple graph
H called the graph obtained from G by collapsing the edge e_0. The vertex set of H is
(V - {v_0, w_0}) ∪ {z_0}. The edge set of H is

$$\begin{aligned}
&\{\{x,y\} : x \ne v_0 \ne y \text{ and } x \ne w_0 \ne y \text{ and } \{x,y\} \in E\} \\
&\cup \{\{x,z_0\} : x \ne v_0 \text{ and } \{x,w_0\} \in E\} \\
&\cup \{\{x,z_0\} : x \ne w_0 \text{ and } \{x,v_0\} \in E\}.
\end{aligned}$$

Pictorially, we construct H from G by shrinking the edge e_0 until the vertices v_0 and w_0
coincide. We replace these two overlapping vertices with a single new vertex z_0. All edges
touching v_0 or w_0 (except the collapsed edge e_0) now touch z_0 instead. See Figure 3.19.


[Figure panels: original graph; graph after collapsing $e_0$.]


FIGURE 3.19 
Collapsing an edge in a simple graph. 


3.98. Theorem: Chromatic Recursion. Let $G = (V,E)$ be a simple graph. Fix any
edge $e = \{v,w\}$ of G. Let $G' = (V, E - \{e\})$ be the simple graph obtained by deleting the
edge e from G, and let $G''$ be the simple graph obtained from G by collapsing the edge e.
Then

\[ \chi_G(x) = \chi_{G'}(x) - \chi_{G''}(x). \]


Proof. Fix $x \in \mathbb{Z}_{>0}$, and let A, B, and C denote the sets of proper colorings of $G$, $G'$, and
$G''$ (respectively) using x available colors. Write $B = B_1 \cup B_2$, where $B_1 = \{f \in B : f(v) = f(w)\}$
and $B_2 = \{f \in B : f(v) \neq f(w)\}$. Note that $B_1$ consists of the proper colorings of
$G'$ (if any) in which vertices v and w are assigned the same color. Let z be the new vertex
in $G''$ that replaces v and w. Given a proper coloring $f \in B_1$, we define a corresponding
coloring $f''$ of $G''$ by setting $f''(z) = f(v) = f(w)$ and $f''(u) = f(u)$ for all $u \in V$ different
from v and w. Since f is proper, it follows from the definition of the edge set of $G''$ that
$f''$ is a proper coloring as well. Thus we have a map $f \mapsto f''$ from $B_1$ to C. This map is
invertible, since the color of z in a coloring of $G''$ determines the common color of v and w
in a coloring of $G'$ belonging to $B_1$. We conclude that $|B_1| = |C|$.

On the other hand, $B_2$ consists of the proper colorings of $G'$ in which vertices v and w
are assigned different colors. These are precisely the proper colorings of G (since G has an
edge between v and w, and G is otherwise identical to $G'$). Thus, $B_2 = A$. It follows that


\[ \chi_G(x) = |A| = |B_2| = |B| - |B_1| = |B| - |C| = \chi_{G'}(x) - \chi_{G''}(x). \qquad \square \]


3.99. Corollary: Polynomiality of Chromatic Functions. For any graph G, $\chi_G(x)$
is a polynomial in x with integer coefficients. (This justifies the terminology chromatic
polynomial.)


Proof. We use induction on the number of edges in G. If G has k vertices and no edges,
the Product Rule gives $\chi_G(x) = x^k$, which is a polynomial in x. Now assume G has $m > 0$
edges. Fix such an edge e, and define $G'$ and $G''$ as in the preceding theorem. $G'$ has one
fewer edge than G. When passing from G to $G''$, we lose the edge e and possibly identify
other edges in G (e.g., if both endpoints of e are adjacent to a third vertex). In any case,
$G''$ has fewer edges than G. By induction on m, we may assume that both $\chi_{G'}(x)$ and
$\chi_{G''}(x)$ are polynomials in x with integer coefficients. So $\chi_G(x) = \chi_{G'}(x) - \chi_{G''}(x)$ is also
a polynomial with integer coefficients. □

FIGURE 3.20
Using the chromatic recursion.


3.100. Remark. We can use the chromatic recursion to compute $\chi_G$ recursively for any
graph G. The base case of the calculation is a graph with k vertices and no edges, which
has chromatic polynomial $x^k$. If G has at least one edge, $G'$ and $G''$ both have strictly
fewer edges than G. Thus, the recursive calculation will terminate after finitely many steps.
However, this is quite an inefficient method for computing $\chi_G$ if G has many vertices and
edges. Thus, direct counting arguments using the Sum Rule and Product Rule may be
preferable to repeatedly applying the chromatic recursion.
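The recursive calculation described in this remark can be sketched in Python as follows. This is a minimal illustration, not an efficient algorithm: the running time grows exponentially with the number of edges, as the remark warns. The function names and the representation of a polynomial as a list of coefficients are choices made for this sketch only.

```python
def chromatic_poly(vertices, edges):
    """Coefficients [c0, c1, ...] of the chromatic polynomial (c_k multiplies x^k).

    vertices: iterable of hashable labels.
    edges: iterable of 2-element pairs describing a simple graph.
    """
    vertex_set = frozenset(vertices)
    edge_set = frozenset(frozenset(e) for e in edges)
    return _chi(vertex_set, edge_set)


def _chi(vertices, edges):
    if not edges:
        # Base case: k vertices and no edges give the polynomial x^k.
        return [0] * len(vertices) + [1]
    e = next(iter(edges))
    v, w = tuple(e)
    # G': delete the edge e.
    deleted = _chi(vertices, edges - {e})
    # G'': collapse e by renaming w to v. Using a set of edges automatically
    # identifies edges that become equal after the renaming.
    merged = frozenset(
        frozenset(v if u == w else u for u in f) for f in edges if f != e
    )
    collapsed = _chi(vertices - {w}, merged)
    collapsed = collapsed + [0] * (len(deleted) - len(collapsed))
    # Chromatic recursion: chi_G = chi_G' - chi_G''.
    return [a - b for a, b in zip(deleted, collapsed)]
```

For instance, `chromatic_poly("abc", ["ab", "bc", "ca"])` handles a triangle, since each two-letter string is read as a pair of vertex labels.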


3.101. Example. Consider the graph G shown on the left in Figure 3.20. We compute
$\chi_G(x)$ by applying the chromatic recursion to the edge $e = \{d,h\}$. The graphs $G'$ and $G''$
obtained by deleting and collapsing this edge are shown on the right in Figure 3.20. Direct
arguments using the Product Rule show that


\[ \chi_{G'}(x) = x(x-1)(x-2)(x-2)(x-1)(x-1) \quad \text{(color a, c, d, f, b, h)}; \]


\[ \chi_{G''}(x) = x(x-1)(x-2)(x-2)(x-2) \quad \text{(color z, a, c, f, b)}. \]


Therefore,


\[ \chi_G(x) = x(x-1)(x-2)^2\bigl((x-1)^2 - (x-2)\bigr) = x^6 - 8x^5 + 26x^4 - 43x^3 + 36x^2 - 12x. \]


3.102. Chromatic Polynomials for Cycles. For each $n \geq 3$, let $C_n$ denote a graph
consisting of n vertices joined in a cycle. Let $C_1$ denote a one-vertex graph, and let $C_2$
denote a graph with two vertices joined by an edge. Finally, let $\chi_n(x) = \chi_{C_n}(x)$ be the
chromatic polynomials for these graphs. We see directly that


\[ \chi_1(x) = x, \qquad \chi_2(x) = x(x-1) = x^2 - x, \qquad \chi_3(x) = x(x-1)(x-2) = x^3 - 3x^2 + 2x. \]


Fix $n \geq 3$ and fix any edge e in $C_n$. Deleting this edge leaves a graph in which n vertices
are joined in a line; the chromatic polynomial of such a graph is $x(x-1)^{n-1}$. On the other
hand, collapsing the edge e in $C_n$ produces a graph isomorphic to $C_{n-1}$. The chromatic
recursion therefore gives


\[ \chi_n(x) = x(x-1)^{n-1} - \chi_{n-1}(x). \]

Using this recursion to compute $\chi_n(x)$ for small n suggests the closed formula


\[ \chi_n(x) = (x-1)^n + (-1)^n (x-1) \quad \text{for all } n \geq 2. \]


One may now prove this formula for $\chi_n(x)$ by induction, using the chromatic recursion.
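The induction step can be written out as follows. This is a sketch of one way to organize the computation; it assumes the closed formula for $n-1 \geq 2$ and uses only the recursion stated above. The base case $n = 2$ is the identity $(x-1)^2 + (x-1) = x(x-1)$.

```latex
\begin{aligned}
\chi_n(x) &= x(x-1)^{n-1} - \chi_{n-1}(x) \\
          &= x(x-1)^{n-1} - \bigl[(x-1)^{n-1} + (-1)^{n-1}(x-1)\bigr] \\
          &= (x-1)\,(x-1)^{n-1} - (-1)^{n-1}(x-1) \\
          &= (x-1)^{n} + (-1)^{n}(x-1).
\end{aligned}
```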




3.17 Spanning Trees 


This section introduces the notion of a spanning tree for a graph. A recursion resembling 
the Chromatic Recursion 3.98 will allow us to count the spanning trees for a given graph. 
This will lead to a remarkable formula, called the Matrix-Tree Theorem, that expresses the 
number of spanning trees as a certain determinant. 


3.103. Definition: Subgraphs. Let $G = (V,E,\epsilon)$ and $H = (W,F,\eta)$ be graphs or digraphs.
H is a subgraph of G iff $W \subseteq V$, $F \subseteq E$, and $\eta(f) = \epsilon(f)$ for all $f \in F$. H is an
induced subgraph of G iff H is a subgraph such that F consists of all edges in E with both
endpoints in W.


3.104. Definition: Spanning Trees. Given a graph $G = (V,E,\epsilon)$, a spanning tree for
G is a subgraph H with vertex set V such that H is a tree. Let $\tau(G)$ be the number of
spanning trees of G.


3.105. Example. Consider the graph G shown in Figure 3.21. This graph has 31 spanning
trees, which are specified by the following sets of edges:


{a,c,d,f}, {b,c,d,f}, {a,c,d,g}, {b,c,d,g}, {a,c,d,h}, {b,c,d,h},
{c,d,e,f}, {c,d,e,g}, {c,d,e,h}, {a,c,e,f}, {a,d,e,f}, {b,c,e,f},
{b,d,e,f}, {a,c,e,g}, {a,d,e,g}, {b,c,e,g}, {b,d,e,g}, {a,c,e,h},
{a,d,e,h}, {b,c,e,h}, {b,d,e,h}, {a,c,f,h}, {a,d,f,h}, {b,c,f,h},
{b,d,f,h}, {a,c,g,h}, {a,d,g,h}, {b,c,g,h}, {b,d,g,h}, {c,d,f,h},
{c,d,g,h}.


FIGURE 3.21
Graph used to illustrate spanning trees.


We see that even a small graph can have many spanning trees. Thus we seek a systematic
method for enumerating these trees.

We are going to derive a recursion involving the quantities $\tau(G)$. For this purpose, we
need to adapt the ideas of deleting an edge and collapsing an edge (see Definition 3.97) from
simple graphs to general graphs. Since loop edges are never involved in spanning trees, we
only consider graphs without loops. Suppose we are given a graph $G = (V, E, \epsilon)$ and a fixed
edge $z \in E$ with endpoints $u, v \in V$. To delete z from G, we replace E by $E' = E - \{z\}$ and
replace $\epsilon$ by the restriction of $\epsilon$ to $E'$. To collapse the edge z, we act as follows: (i) delete z
and any other edges linking u to v; (ii) replace V by $(V - \{u, v\}) \cup \{w\}$, where w is a new
vertex; (iii) for each edge $y \in E$ that has exactly one endpoint in the set $\{u, v\}$, modify $\epsilon(y)$
by replacing this endpoint with the new vertex w.


3.106. Example. Let G be the graph shown in Figure 3.21. Figure 3.22 displays the graphs 
obtained from G by deleting edge f and collapsing edge f. 


[Figure panels: delete edge f; collapse edge f.]


FIGURE 3.22 
Effect of deleting or collapsing an edge. 


3.107. Theorem: Spanning Tree Recursion. Let $G = (V,E,\epsilon)$ be a graph, and let
$z \in E$ be a fixed edge. Let $G_0$ be the graph obtained from G by deleting z. Let $G_1$ be the
graph obtained from G by collapsing z. Then


\[ \tau(G) = \tau(G_0) + \tau(G_1). \]


The initial conditions are: $\tau(G) = 0$ if G is not connected, and $\tau(G) = 1$ if G is a tree with
vertex set V.


Proof. For every graph K, let Sp(K) be the set of all spanning trees of K, so $\tau(K) = |\mathrm{Sp}(K)|$.
Fix the graph G and the edge z. Let X be the set of trees in Sp(G) that do not
use the edge z, and let Y be the set of trees in Sp(G) that do use the edge z. Sp(G) is
the disjoint union of X and Y, so $\tau(G) = |X| + |Y|$ by the Sum Rule. Now, it follows from
the definition of edge deletion that the set X is precisely the set $\mathrm{Sp}(G_0)$, so $|X| = \tau(G_0)$.
To complete the proof, we need to show that $|Y| = \tau(G_1)$. It suffices to define a bijection
$F : Y \to \mathrm{Sp}(G_1)$.

Suppose $T \in Y$ is a spanning tree of G that uses the edge z with endpoints u, v. Define
F(T) to be the graph obtained from T by collapsing the edge z; this graph is a subgraph
of $G_1$. Let n be the number of vertices of G; then T is a connected graph with $n - 1$ edges,
one of which is z. It is routine to check that F(T) is still connected. Furthermore, since T is
a tree, z is the only edge in T between u and v. It follows from the definition of collapsing
that F(T) has exactly $n - 2$ edges. Since $G_1$ has $n - 1$ vertices, it follows that F(T) is a
spanning tree of $G_1$. We see also that the edge set of F(T) is precisely the edge set of T
with z removed. So far, we have shown that F is a well-defined function mapping Y into
$\mathrm{Sp}(G_1)$.

FIGURE 3.23
Auxiliary graphs used in the computation of $\tau(G)$.


Next we define a map $H : \mathrm{Sp}(G_1) \to Y$ that is the two-sided inverse of F. Given
$U \in \mathrm{Sp}(G_1)$ with edge set E(U), let H(U) be the unique subgraph of G with vertex set V
and edge set $E(U) \cup \{z\}$. We must check that H(U) does lie in the claimed codomain Y.
First, H(U) is a subgraph of G with $n - 1$ edges, one of which is the edge z. Furthermore,
it can be checked that H(U) is connected, since walks in U can be expanded using the edge
z if needed to give walks in H(U). Therefore, H(U) is a spanning tree of G using z, and
so $H(U) \in Y$. Since F removes z from the edge set while H adds it back, F and H are
two-sided inverses of each other. Hence both are bijections, and the proof is complete. □


3.108. Example. We use the graphs in Figures 3.21 and 3.22 to illustrate the proof of
the spanning tree recursion, taking z = f. The graph $G_0$ on the left of Figure 3.22 has 19
spanning trees; they are precisely the trees listed in Example 3.105 that do not use the edge
f. Applying F to each of the remaining 12 spanning trees on the list produces the following
subgraphs of $G_1$ (specified by their edge sets):


{a,c,d}, {b,c,d}, {c,d,e}, {a,c,e}, {a,d,e}, {b,c,e},
{b,d,e}, {a,c,h}, {a,d,h}, {b,c,h}, {b,d,h}, {c,d,h}.


These are precisely the spanning trees of $G_1$.

Next, we illustrate the calculation of $\tau(G)$ using the recursion. We first delete and
collapse edge f, producing the graphs $G_0$ and $G_1$ shown in Figure 3.22. We know that
$\tau(G) = \tau(G_0) + \tau(G_1)$. Deletion of edge g from $G_0$ produces a new graph $G_2$ (Figure 3.23),
while collapsing g in $G_0$ leads to another copy of $G_1$. So far, we have $\tau(G) = 2\tau(G_1) + \tau(G_2)$.
Continuing to work on $G_1$, we see that deleting edge h leads to the graph $G_3$ in Figure 3.23,
whereas collapsing edge h leads to the graph $G_4$ in that figure. On the other hand, deleting
h from $G_2$ leaves a disconnected graph (which can be discarded), while collapsing h from
$G_2$ produces another copy of $G_3$. Now we have $\tau(G) = 3\tau(G_3) + 2\tau(G_4)$. Deleting edge e
from $G_3$ gives a graph that has two spanning trees (by inspection), while collapsing e in $G_3$
leads to $G_4$ again. So $\tau(G) = 3(2 + \tau(G_4)) + 2\tau(G_4) = 6 + 5\tau(G_4)$. Finally, deletion of d
from $G_4$ leaves a graph with two spanning trees, while collapsing d produces a graph with
three spanning trees. We conclude that $\tau(G_4) = 5$, so $\tau(G) = 6 + 25 = 31$, in agreement
with the enumeration in Example 3.105.
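The recursion can also be carried out mechanically. The Python sketch below is one possible implementation for loopless multigraphs given as lists of vertex pairs (a repeated pair stands for parallel edges). The edge list `FIG_3_21_EDGES` is an assumption: it is read off from the Laplacian matrix in Example 3.114, which records how many edges join each pair of vertices, and it ignores the edge labels a through h.

```python
def is_connected(vertices, edges):
    """Depth-first search connectivity test for an undirected multigraph."""
    vertices = set(vertices)
    if not vertices:
        return True
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for a, b in edges:
            for p, q in ((a, b), (b, a)):
                if p == x and q not in seen:
                    seen.add(q)
                    stack.append(q)
    return seen == vertices


def count_spanning_trees(vertices, edges):
    """tau(G) via the recursion tau(G) = tau(G0) + tau(G1)."""
    vertices = frozenset(vertices)
    edges = [tuple(e) for e in edges]
    if not is_connected(vertices, edges):
        return 0  # initial condition: disconnected graphs have no spanning tree
    if len(edges) == len(vertices) - 1:
        return 1  # connected with |V| - 1 edges, so G is itself a tree
    u, v = edges[0]
    deleted = edges[1:]  # G0: delete the chosen edge
    # G1: drop every edge joining u and v, then rename v to u elsewhere.
    collapsed = [
        (u if a == v else a, u if b == v else b)
        for a, b in deleted
        if {a, b} != {u, v}
    ]
    return count_spanning_trees(vertices, deleted) + count_spanning_trees(
        vertices - {v}, collapsed
    )


# Assumed edge multiset of the graph in Figure 3.21 (vertices 0..4).
FIG_3_21_EDGES = [(0, 2), (0, 2), (0, 3), (0, 4), (1, 3), (1, 3), (1, 4), (2, 4)]
```

Calling `count_spanning_trees(range(5), FIG_3_21_EDGES)` repeats the hand calculation above, although the order in which edges are chosen differs from the order used in the example.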


Next we extend the preceding discussion to rooted spanning trees in digraphs. 


3.109. Definition: Rooted Spanning Trees. Let $G = (V,E,\epsilon)$ be a digraph, and let
$v_0 \in V$. A spanning tree of G rooted at $v_0$ is a rooted tree T with root $v_0$ and vertex set
V such that T (without the loop at $v_0$) is a subgraph of G. Let $\tau(G, v_0)$ be the number of
spanning trees of G rooted at $v_0$.

The notions of edge deletion and contraction extend to digraphs. This leads to the
following recursion for counting rooted spanning trees.


3.110. Theorem: Rooted Spanning Tree Recursion. Let $v_0$ be a fixed vertex in a
digraph G, and let z be a fixed edge leading into $v_0$. Let $G_1$ be the digraph obtained from
G by deleting z. Let $G_2$ be the digraph obtained from G by collapsing z, and let the new
collapsed vertex in $G_2$ be $v_0'$. Then


\[ \tau(G, v_0) = \tau(G_1, v_0) + \tau(G_2, v_0'). \]


Proof. We modify the proof of Theorem 3.107. As before, the two terms on the right side
count rooted spanning trees of G that do not contain z or do contain z. One may check
that if T is a rooted spanning tree using the edge z, then the graph obtained from T by
collapsing z is a rooted spanning tree of $G_2$ rooted at $v_0'$. Similarly, adding z to the edge set
of a rooted spanning tree of $G_2$ rooted at $v_0'$ produces a rooted spanning tree of G rooted
at $v_0$. □


3.111. Remark. Our results for counting undirected spanning trees are special cases of
the corresponding results for rooted spanning trees. Starting with any graph $G = (V, E, \epsilon)$,
we consider the associated digraph obtained by replacing each $e \in E$ by two directed edges
going in opposite directions. As in the proof of Theorem 3.73, there is a bijection between
the set of rooted spanning trees of this digraph rooted at any given vertex $v_0 \in V$ and the
set of spanning trees of G. In what follows, we shall only treat the case of digraphs.




3.18 The Matrix-Tree Theorem 


There is a remarkable determinant formula for the number of rooted spanning trees of a 
digraph. The formula uses the following modified version of the adjacency matrix of the 
digraph. 


3.112. Definition: Laplacian Matrix of a Digraph. Let G be a loopless digraph on
the vertex set $V = \{v_0, v_1, \ldots, v_n\}$. The Laplacian matrix of G is the matrix
$L = (L_{ij} : 0 \leq i, j \leq n)$ such that $L_{ii} = \operatorname{outdeg}(v_i)$ and, for $i \neq j$, $L_{ij}$ is the negative of the
number of edges from $v_i$ to $v_j$ in G. We let $L_0$ be the $n \times n$ matrix obtained by erasing the
row and column of L corresponding to $v_0$. The matrix $L_0 = L_0(G)$ is called the truncated
Laplacian matrix of G (relative to $v_0$).


3.113. The Matrix-Tree Theorem. With the notation of the preceding definition, we
have
\[ \tau(G, v_0) = \det(L_0(G)). \]


We prove the theorem after considering two examples. 


3.114. Example. Let G be the digraph associated to the undirected graph in Figure 3.21.
In this case, $L_{ii}$ is the degree of vertex i in the undirected graph, and $L_{ij}$ is the negative of
the number of undirected edges between i and j. So


\[
L = \begin{bmatrix}
 4 &  0 & -2 & -1 & -1 \\
 0 &  3 &  0 & -2 & -1 \\
-2 &  0 &  3 &  0 & -1 \\
-1 & -2 &  0 &  3 &  0 \\
-1 & -1 & -1 &  0 &  3
\end{bmatrix}.
\]


Erasing the row and column corresponding to vertex 0 leaves


\[
L_0 = \begin{bmatrix}
 3 &  0 & -2 & -1 \\
 0 &  3 &  0 & -1 \\
-2 &  0 &  3 &  0 \\
-1 & -1 &  0 &  3
\end{bmatrix}.
\]


We compute $\det(L_0) = 31$, which agrees with our earlier calculation of $\tau(G)$.


FIGURE 3.24
Digraph used to illustrate the Matrix-Tree Theorem.


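3.115. Example. Consider the digraph G shown in Figure 3.24. We compute


\[
L = \begin{bmatrix}
\ast & \ast & \ast & \ast & \ast & \ast \\
-1 & 2 & 0 & -1 & 0 & 0 \\
-1 & 0 & 2 & 0 & -1 & 0 \\
-1 & 0 & 0 & 1 & 0 & 0 \\
-1 & 0 & 0 & -3 & 4 & 0 \\
 0 & 0 & -1 & 0 & 0 & 1
\end{bmatrix},
\qquad
L_0 = \begin{bmatrix}
2 & 0 & -1 & 0 & 0 \\
0 & 2 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & -3 & 4 & 0 \\
0 & -1 & 0 & 0 & 1
\end{bmatrix},
\]


(the entries in row 0 of L, marked $\ast$, are not legible here; they play no role in $L_0$)
and $\det(L_0) = 16$. So G has 16 spanning trees rooted at 0, as can be confirmed by direct
enumeration. We use the matrix $L_0$ as a running example in the proof below.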
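Both determinant calculations can be reproduced with a short Python sketch. The helper names `truncated_laplacian` and `det` are illustrative. The arc lists are assumptions: `ARCS_3_114` doubles each undirected edge of Figure 3.21 as recorded in the Laplacian of Example 3.114, and `ARCS_3_115` is read from the rows of the truncated Laplacian in Example 3.115, omitting the arcs that leave vertex 0 because they do not affect the truncated matrix.

```python
from fractions import Fraction


def truncated_laplacian(n_vertices, arcs, root=0):
    """Laplacian of a loopless digraph with the row and column of `root` erased.

    arcs: list of (tail, head) pairs on vertices 0 .. n_vertices - 1.
    """
    L = [[0] * n_vertices for _ in range(n_vertices)]
    for tail, head in arcs:
        L[tail][tail] += 1  # diagonal entry: outdegree of the tail
        L[tail][head] -= 1  # off-diagonal: minus the number of arcs tail -> head
    keep = [i for i in range(n_vertices) if i != root]
    return [[L[i][j] for j in keep] for i in keep]


def det(matrix):
    """Exact determinant of an integer matrix by Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return 0  # a zero column below the diagonal: singular matrix
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign  # each row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return int(sign * result)  # the empty matrix gives 1


# Assumed data: undirected edges of Figure 3.21, each replaced by two arcs.
_EDGES = [(0, 2), (0, 2), (0, 3), (0, 4), (1, 3), (1, 3), (1, 4), (2, 4)]
ARCS_3_114 = [(u, v) for u, v in _EDGES] + [(v, u) for u, v in _EDGES]

# Assumed data: arcs of the digraph in Example 3.115 leaving vertices 1..5.
ARCS_3_115 = [(1, 0), (1, 3), (2, 0), (2, 4), (3, 0),
              (4, 0), (4, 3), (4, 3), (4, 3), (5, 2)]
```

Exact rational arithmetic is used so that the determinant comes out as an integer with no rounding; for large graphs a numerical library would be the more practical choice.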


3.116. Proof of the Matrix-Tree Theorem. Write $L_0 = L_0(G)$. First we prove that
$\tau(G, v_0) = \det(L_0)$ in the case where $\operatorname{indeg}(v_0) = 0$. If $v_0$ is the only vertex of G, then
$\tau(G, v_0) = 1$ and $\det(L_0) = 1$ by the convention that the determinant of a $0 \times 0$ matrix is 1.
Otherwise, $\tau(G, v_0)$ is zero (no edge enters $v_0$, so no other vertex can reach the root), and
$L_0$ is a nonempty matrix. Using the condition $\operatorname{indeg}(v_0) = 0$
and the definition of $L_0$, we see that every row of $L_0$ sums to zero. Therefore, letting u be
a column vector of n ones, we have $L_0 u = 0$, so that $L_0$ is singular and $\det(L_0) = 0$.

For the general case, we use induction on the number of edges in G. The case where G
has no edges is covered by the previous paragraph. The only case left to consider occurs
when $\operatorname{indeg}(v_0) > 0$. Let e be a fixed edge in G that leads from some $v_i$ to $v_0$. Let $G_1$ be
the graph obtained from G by deleting e, and let $G_2$ be the graph obtained from G by
collapsing e. Both graphs have fewer edges than G, so the induction hypothesis tells us that


\[ \tau(G_1, v_0) = \det(L_0(G_1)) \quad\text{and}\quad \tau(G_2, v_0') = \det(L_0(G_2)), \tag{3.4} \]


where $v_0'$ is the new vertex created after collapsing e. Using Theorem 3.110, we conclude
that
\[ \tau(G, v_0) = \det(L_0(G_1)) + \det(L_0(G_2)). \tag{3.5} \]
Next, we evaluate the determinant $\det(L_0(G))$. We use the fact that the determinant of
a matrix is a linear function of each row of the matrix. More precisely, for a fixed matrix A
and row index i, let $A[y]$ denote the matrix A with the ith row replaced by the row vector y;
then $\det(A[y + z]) = \det(A[y]) + \det(A[z])$ for all y, z. This linearity property can be proved
directly from the definition of the determinant (see Definition 12.40 and Theorem 12.48 for
details). To apply this result, write the ith row of $L_0 = L_0(G)$ in the form $y + z$, where
$z = (0, 0, \ldots, 1, 0, \ldots, 0)$ has a 1 in position i. Then


\[ \det(L_0(G)) = \det(L_0[y]) + \det(L_0[z]). \tag{3.6} \]


For example, if G is the digraph in Figure 3.24 and e is the edge from 2 to 0 (so i = 2),
then $y = (0, 1, 0, -1, 0)$, $z = (0, 1, 0, 0, 0)$,


\[
L_0[y] = \begin{bmatrix}
2 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & -3 & 4 & 0 \\
0 & -1 & 0 & 0 & 1
\end{bmatrix},
\qquad
L_0[z] = \begin{bmatrix}
2 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & -3 & 4 & 0 \\
0 & -1 & 0 & 0 & 1
\end{bmatrix}.
\]


Comparing equations (3.5) and (3.6), we see that it suffices to prove $\det(L_0(G_1)) = \det(L_0[y])$
and $\det(L_0(G_2)) = \det(L_0[z])$.

How does the removal of e from G affect L(G)? Answer: The i,i-entry drops by 1, while
the i,0-entry increases by 1. Since the zeroth column is ignored in the truncated Laplacian,
we see that we can obtain $L_0(G_1)$ from $L_0(G)$ by decrementing the i,i-entry by 1. In other
words, $L_0(G_1) = L_0[y]$, and hence $\det(L_0(G_1)) = \det(L_0[y])$.

Next, let us calculate $\det(L_0[z])$ by expanding the determinant along row i. The only
nonzero entry in this row is the 1 in the diagonal position, so $\det(L_0[z]) = (-1)^{i+i} \det(M) = \det(M)$,
where M is the matrix obtained from $L_0[z]$ (or equivalently, from $L_0$) by erasing
row i and column i. In our running example,


\[
M = \begin{bmatrix}
2 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & -3 & 4 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}.
\]


We claim that $M = L_0(G_2)$, which will complete the proof. Consider the k,j-entry of
M, where $k, j \in \{0, 1, \ldots, n\} - \{0, i\}$. If $k = j$, this entry is $\operatorname{outdeg}_G(v_j)$, which equals
$\operatorname{outdeg}_{G_2}(v_j)$ because $v_j$ is not $v_0$, $v_i$, or $v_0'$. For the same reason, if $k \neq j$, the k,j-entry of
M is the negative of the number of edges from $v_k$ to $v_j$, which is the same in G and $G_2$.
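The row-linearity step can be checked numerically on the running example. In the Python sketch below, the entries of `L0` are an assumption (they are the truncated Laplacian of Example 3.115 as read from that example), and `det` is a small cofactor-expansion helper written for this illustration only.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


# Assumed truncated Laplacian of the running example (rows for v_1, ..., v_5).
L0 = [
    [2, 0, -1, 0, 0],
    [0, 2, 0, -1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, -3, 4, 0],
    [0, -1, 0, 0, 1],
]

i = 1  # zero-based position of the row belonging to vertex v_2
z = [1 if j == i else 0 for j in range(len(L0))]   # unit vector with a 1 in position i
y = [a - b for a, b in zip(L0[i], z)]              # so that row i equals y + z

L0_y = L0[:i] + [y] + L0[i + 1:]                   # L0[y]
L0_z = L0[:i] + [z] + L0[i + 1:]                   # L0[z]
M = [row[:i] + row[i + 1:] for k, row in enumerate(L0) if k != i]  # erase row i, column i
```

With these definitions, `det(L0)` should equal `det(L0_y) + det(L0_z)`, and `det(L0_z)` should equal `det(M)`, mirroring equation (3.6) and the expansion along row i.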




3.19 Eulerian Tours 


3.117. Definition: Eulerian Tours. Let $G = (V, E, \epsilon)$ be a digraph. An Eulerian tour in
G is a walk $W = (v_0, e_1, v_1, e_2, v_2, \ldots, e_n, v_n)$ such that W visits every vertex in V, and W
uses every edge in E exactly once. Such a tour is called closed iff $v_n = v_0$.


3.118. Example. For the digraph G shown in Figure 3.25, one closed Eulerian tour of G
is
\[ W_1 = (0, m, 2, l, 5, e, 1, a, 3, c, 4, b, 3, d, 5, f, 4, g, 5, k, 0, i, 4, h, 5, j, 0). \]
To specify the tour, it suffices to list only the edges in the tour. For instance, here is the
edge sequence of another closed Eulerian tour of G:


\[ W_2 = (i, g, e, a, d, f, b, c, h, j, m, l, k). \]


3.119. Example. The digraph G shown in Figure 3.2 does not have any closed Eulerian 
tours, since there is no way to reach vertex 6 from the other vertices. Even if we delete 
vertex 6 from the graph, there are still no closed Eulerian tours. The reason is that no tour 
can use both edges leaving vertex 2, since only one edge enters vertex 2. 


The previous example indicates two necessary conditions for a digraph to have a closed 
Eulerian tour: the digraph must be connected, and also balanced in the sense that indeg(v) = 
outdeg(v) for every vertex v. We now show that these necessary conditions are also sufficient 
to guarantee the existence of a closed Eulerian tour. 


3.120. Theorem: Existence of Closed Eulerian Tours. A digraph $G = (V, E, \epsilon)$ has
a closed Eulerian tour iff G is connected and balanced.


Proof. First suppose G has a closed Eulerian tour W starting at $v_0$. Since W visits every
vertex, we can obtain a walk from any vertex to any other vertex by following certain edges
of W. So G is connected. Next, let v be any vertex of G. The walk W arrives at v via an
incoming edge exactly as often as the walk leaves v via an outgoing edge; this is true even
if $v = v_0$. Since the walk uses every edge exactly once, it follows that indeg(v) = outdeg(v).

Conversely, assume that G is connected and balanced. Let $W = (v_0, e_1, v_1, \ldots, e_n, v_n)$
be a walk of maximum length in G that never repeats an edge. We claim that $v_n = v_0$.
Otherwise, the walk W would enter vertex $v_n$ one more time than it leaves $v_n$. Since
$\operatorname{indeg}(v_n) = \operatorname{outdeg}(v_n)$, there must be an outgoing edge from $v_n$ that has not been
used by W. So we could use this edge to extend W, contradicting maximality of W.
Next, we claim that W uses every edge of G. If not, let e be an edge not used by W.
Since G is connected, we can find such an edge that is incident to one of the vertices
$v_i$ visited by W. Since $v_n = v_0$, we can cyclically shift the walk W to get a new walk
$W' = (v_i, e_{i+1}, v_{i+1}, \ldots, e_n, v_n = v_0, e_1, \ldots, e_i, v_i)$ that starts and ends at $v_i$. By adding the
edge e to the beginning or end of this walk (depending on its direction), we could again
produce a longer walk than W with no repeated edges, violating maximality. Finally, W
must visit every vertex of G, since W uses every edge of G and (unless G has one vertex
and no edges) every vertex has an edge leaving it. □
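The theorem can be illustrated computationally. The Python sketch below tests the balance condition and then builds a closed tour with Hierholzer's algorithm, a standard constructive method that is a substitute for, not a transcription of, the maximal-walk argument in the proof. The dictionary `ARCS` is an assumption: it lists the arcs of the digraph in Figure 3.25 as they can be read off from the tour $W_1$ in Example 3.118.

```python
from collections import defaultdict

# Assumed arcs of the digraph in Figure 3.25: label -> (tail, head).
ARCS = {
    "a": (1, 3), "b": (4, 3), "c": (3, 4), "d": (3, 5), "e": (5, 1),
    "f": (5, 4), "g": (4, 5), "h": (4, 5), "i": (0, 4), "j": (5, 0),
    "k": (5, 0), "l": (2, 5), "m": (0, 2),
}


def is_balanced(arcs):
    """True iff indeg(v) == outdeg(v) for every vertex that touches an arc."""
    balance = defaultdict(int)
    for tail, head in arcs.values():
        balance[tail] += 1
        balance[head] -= 1
    return all(b == 0 for b in balance.values())


def closed_eulerian_tour(arcs, start):
    """Arc labels of a closed Eulerian tour from `start`, or None if there is none."""
    if not is_balanced(arcs):
        return None
    unused = defaultdict(list)
    for label, (tail, _) in arcs.items():
        unused[tail].append(label)
    # Each stack entry is (current vertex, arc used to arrive there).
    stack, tour = [(start, None)], []
    while stack:
        vertex, via = stack[-1]
        if unused[vertex]:
            label = unused[vertex].pop()
            stack.append((arcs[label][1], label))
        else:
            stack.pop()
            if via is not None:
                tour.append(via)  # arcs are emitted in reverse order
    tour.reverse()
    # If some arc was never reached, the digraph is not connected.
    return tour if len(tour) == len(arcs) else None
```

The tour returned depends on the order in which unused arcs are popped, so it need not coincide with $W_1$ or $W_2$.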


Our goal in the rest of this section is to count the number of closed Eulerian tours in G
starting at a given vertex $v_0$. Recall that $\tau(G, v_0)$ is the number of rooted spanning trees of
G rooted at $v_0$.


FIGURE 3.25
Digraph used to illustrate Eulerian tours.

FIGURE 3.26
Rooted spanning trees associated to Eulerian tours.


3.121. The Eulerian Tour Rule. Let $G = (V, E, \epsilon)$ be a connected, balanced digraph.
For each $v_0 \in V$, the number of closed Eulerian tours of G starting at $v_0$ is


\[ \tau(G, v_0) \cdot \operatorname{outdeg}(v_0)! \cdot \prod_{v \neq v_0} (\operatorname{outdeg}(v) - 1)!. \tag{3.7} \]


Let $\{v_0, v_1, \ldots, v_n\}$ be the vertex set of G. Let X be the set of all closed Eulerian tours
of G starting at $v_0$. Let $\mathrm{SpTr}(G, v_0)$ be the set of spanning trees of G rooted at $v_0$. Let Y
be the set of all tuples $(T, w_0, w_1, w_2, \ldots, w_n)$ satisfying these conditions: $T \in \mathrm{SpTr}(G, v_0)$;
$w_0$ is a permutation of all the edges leaving $v_0$; and, for $1 \leq i \leq n$, $w_i$ is a permutation
of those edges leaving $v_i$ other than the unique outgoing edge from $v_i$ that belongs to T
(see Definition 3.42). By the Product Rule, the cardinality of Y is given by the right side
of (3.7). So it suffices to define a bijection $f : X \to Y$.

Given an Eulerian tour $W \in X$, define $f(W) = (T, w_0, \ldots, w_n)$ as follows. For each i
between 0 and n, let $w_i'$ be the permutation of all edges leading out of $v_i$, taken in the
order in which they occur in the walk W. Call $w_i'$ the departure word of vertex $v_i$. Next, set
$w_0 = w_0'$ and for $i > 0$, let $w_i$ be the word $w_i'$ with the last symbol erased. Finally, let T be
the subgraph of G whose edges are given by the last symbols of $w_1', \ldots, w_n'$, augmented by
a loop edge at $v_0$. It is not immediately evident that $T \in \mathrm{SpTr}(G, v_0)$; we prove this shortly.

Next we define a map $g : Y \to X$ that is the two-sided inverse of f. Fix $(T, w_0, \ldots, w_n) \in Y$.
For every $i > 0$, form $w_i'$ by appending the unique edge of T leaving $v_i$ to the end of the
word $w_i$; let $w_0' = w_0$. Starting at $v_0$, we use the words $w_i'$ to build a walk through G, one
edge at a time, as follows. If we are currently at some vertex $v_i$, use the next unread symbol
in $w_i'$ to determine which edge to follow out of $v_i$. Repeat this process until the walk reaches
a vertex in which all the outgoing edges have already been used. The resulting walk W is
$g(T, w_0, \ldots, w_n)$. The edges occurring in W are pairwise distinct, but it is not immediately
evident that W must use all edges of G; we prove this shortly.

Once we check that f and g map into their stated codomains, the definitions just given
show that $f \circ g$ and $g \circ f$ are both identity maps. Before proving that f maps into Y and g
maps into X, we consider an example.


3.122. Example. We continue the analysis of Eulerian tours in the digraph G from Example 3.118.
The walk $W_1$ in that example has departure words $w_0' = mi$, $w_1' = a$, $w_2' = l$,
$w_3' = cd$, $w_4' = bgh$, and $w_5' = efkj$. Therefore,

\[ f(W_1) = (T_1, mi, \cdot, \cdot, c, bg, efk), \]

where $\cdot$ denotes an empty word and $T_1$ is the graph shown on the left in Figure 3.26.


Similarly, for $W_2$ we compute $w_0' = im$, $w_1' = a$, $w_2' = l$, $w_3' = dc$, $w_4' = gbh$, $w_5' = efjk$,
and


\[ f(W_2) = (T_2, im, \cdot, \cdot, d, gb, efj). \]


We now calculate $g(T_1, im, \cdot, \cdot, c, bg, fke)$. First, we use the edges of $T_1$ to recreate the
departure words $w_0' = im$, $w_1' = a$, $w_2' = l$, $w_3' = cd$, $w_4' = bgh$, and $w_5' = fkej$. We then
use these words to guide our tour through the graph. We begin with 0, i, 4, since i is the
first letter of $w_0'$. Consulting $w_4'$ next, we follow edge b to vertex 3, then edge c to vertex 4,
then edge g to vertex 5, and so on. We obtain the tour


\[ W_3 = (0, i, 4, b, 3, c, 4, g, 5, f, 4, h, 5, k, 0, m, 2, l, 5, e, 1, a, 3, d, 5, j, 0). \]

Similarly, we compute

\[ g(T_2, mi, \cdot, \cdot, d, bg, jfe) = (m, l, j, i, b, d, f, g, e, a, c, h, k). \]
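The maps f and g can be sketched in Python and tried on this example. The arc dictionary `ARCS` is an assumption read off from the tour $W_1$ of Example 3.118, tours are given by their edge sequences, and words are stored as strings, which works here only because every edge label is a single letter. The names `f` and `g` follow the text; everything else is illustrative.

```python
# Assumed arcs of the digraph in Figure 3.25: label -> (tail, head).
ARCS = {
    "a": (1, 3), "b": (4, 3), "c": (3, 4), "d": (3, 5), "e": (5, 1),
    "f": (5, 4), "g": (4, 5), "h": (4, 5), "i": (0, 4), "j": (5, 0),
    "k": (5, 0), "l": (2, 5), "m": (0, 2),
}


def _vertices(arcs):
    return sorted({v for arc in arcs.values() for v in arc})


def f(tour_labels, arcs, root=0):
    """Send a closed Eulerian tour (edge labels, starting at root) to (T, words)."""
    departure = {v: [] for v in _vertices(arcs)}
    for label in tour_labels:
        departure[arcs[label][0]].append(label)  # departure word of the tail
    # T consists of the last edge leaving each non-root vertex.
    tree = {departure[v][-1] for v in departure if v != root}
    words = tuple(
        "".join(departure[v]) if v == root else "".join(departure[v][:-1])
        for v in departure
    )
    return tree, words


def g(tree, words, arcs, root=0):
    """Rebuild the edge sequence of a tour from a tree and the words w_0, ..., w_n."""
    tree_arc = {arcs[label][0]: label for label in tree}
    remaining = {}
    for v, word in zip(_vertices(arcs), words):
        # Re-create the departure word by appending the tree edge (except at the root).
        remaining[v] = list(word) + ([] if v == root else [tree_arc[v]])
    tour, current = [], root
    while remaining[current]:
        label = remaining[current].pop(0)  # next unread symbol
        tour.append(label)
        current = arcs[label][1]
    return tour


W1 = "mleacbdfgkihj"  # edge sequence of the tour W_1
```

Here `f(W1, ARCS)` returns the edge set of $T_1$ together with the words, and applying `g` to that output recovers the edge sequence of $W_1$. This sketch does not check that its input to `g` really lies in Y.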


To complete the proof of Rule 3.121, we must prove two things. First, to show that
$f(W) \in Y$ for all $W \in X$, we must show that the digraph T obtained from the last letters
of the departure words $w_i'$ (for $i > 0$) is a rooted spanning tree of G rooted at $v_0$. Since W
visits every vertex of G, the definition of T shows that $\operatorname{outdeg}_T(v_i) = 1$ for all $i > 0$. We
need only show that T has no cycles other than the loop at $v_0$ (see Definition 3.42). We can
view the tour W as a certain permutation of all the edges in G. Let us show that if e, h are
two non-loop edges in T with $\epsilon(e) = (x, y)$ and $\epsilon(h) = (y, z)$, then e must precede h in the
permutation W. Note that y cannot be $v_0$, since the only outgoing edge from $v_0$ in T is a
loop edge. Thus, when the tour W uses the edge e to enter y, the following edge in the tour
exists and is an outgoing edge from y. Since h is, by definition, the last such edge used by
the tour, e must precede h in the tour. Now suppose $(z_0, e_1, z_1, \ldots, e_n, z_n)$ is a cycle in T
that is not the 1-cycle at $v_0$. Using the previous remark repeatedly, we see that $e_i$ precedes
$e_{i+1}$ in W for all i, and also $e_n$ precedes $e_1$ in W. These statements imply that $e_1$ precedes
itself in W, which is impossible. We conclude that $f(W) \in Y$.

Second, we must show that g maps Y into X. Fix $(T, w_0, \ldots, w_n) \in Y$ and
$W = g(T, w_0, \ldots, w_n)$, and let $w_i'$ be the departure word constructed from T and $w_i$. We know
from the definition of g that W is a walk in G starting at $v_0$ that never repeats an edge.
We must show that W ends at $v_0$ and uses every edge in G. Suppose, at some stage in the
construction of W, that W has just reached $v_i$ for some $i > 0$. Then W has entered $v_i$ one
more time than it has left $v_i$. Since G is balanced, there must exist an unused outgoing edge
from $v_i$. This edge corresponds to an unused letter in $w_i'$. So W does not end at $v_i$. The
only possibility is that W ends at the starting vertex $v_0$.

To prove that W uses every edge of G, we claim that it is enough to prove that W uses
every non-loop edge of T. To establish the claim, consider a vertex $v \neq v_0$ of G. If W uses
the unique outgoing edge from v that is part of T, then W must have previously used all
other outgoing edges from v, by definition of W. Since W ends at $v_0$, W certainly uses all
outgoing edges from $v_0$. All edges are accounted for in this way, proving the claim.

Finally, to get a contradiction, assume that some edge e in T from x to y is not used
by W. Since T is a rooted tree rooted at $v_0$, we can choose such an e so that the distance
from y to $v_0$ through edges in T is minimal. If $y \neq v_0$, minimality implies that the unique
edge leading out of y in T does belong to W. Then, as noted in the last paragraph, every
outgoing edge from y in G is used in W. Since G is balanced, every incoming edge into y in
G must also appear in W, contradicting the assumption that e is not used by W. On the
other hand, if $y = v_0$, we see similarly that W uses every outgoing edge from y in G and
hence every incoming edge to y in G. Again, this contradicts the assumption that e is not
in W. This completes the proof of the Eulerian Tour Rule.
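As a sanity check on formula (3.7), the following Python sketch compares the value given by the rule with a direct backtracking count of closed Eulerian tours. The arc dictionary `ARCS` is an assumption read off from the tour $W_1$ of Example 3.118, and rooted spanning trees are counted by brute force rather than by a determinant; all function names are illustrative.

```python
from itertools import product
from math import factorial

# Assumed arcs of the digraph in Figure 3.25: label -> (tail, head).
ARCS = {
    "a": (1, 3), "b": (4, 3), "c": (3, 4), "d": (3, 5), "e": (5, 1),
    "f": (5, 4), "g": (4, 5), "h": (4, 5), "i": (0, 4), "j": (5, 0),
    "k": (5, 0), "l": (2, 5), "m": (0, 2),
}


def count_closed_tours(arcs, start):
    """Backtracking count of closed Eulerian tours that start at `start`."""
    out = {}
    for label, (tail, _) in arcs.items():
        out.setdefault(tail, []).append(label)
    used = set()

    def extend(vertex):
        if len(used) == len(arcs):
            return 1 if vertex == start else 0
        total = 0
        for label in out.get(vertex, []):
            if label not in used:
                used.add(label)
                total += extend(arcs[label][1])
                used.remove(label)
        return total

    return extend(start)


def count_rooted_spanning_trees(arcs, root):
    """Pick one outgoing arc per non-root vertex; keep choices that all lead to root."""
    vertices = sorted({v for arc in arcs.values() for v in arc})
    others = [v for v in vertices if v != root]
    out = {v: [l for l, (t, _) in arcs.items() if t == v] for v in others}
    count = 0
    for choice in product(*(out[v] for v in others)):
        parent = {v: arcs[label][1] for v, label in zip(others, choice)}
        if all(_reaches(v, parent, root, len(vertices)) for v in others):
            count += 1
    return count


def _reaches(v, parent, root, limit):
    for _ in range(limit):
        if v == root:
            return True
        v = parent[v]
    return v == root


def tour_rule(arcs, root):
    """Right side of (3.7): tau * outdeg(root)! * product of (outdeg(v) - 1)!."""
    outdeg = {}
    for tail, _ in arcs.values():
        outdeg[tail] = outdeg.get(tail, 0) + 1
    result = count_rooted_spanning_trees(arcs, root) * factorial(outdeg[root])
    for v, d in outdeg.items():
        if v != root:
            result *= factorial(d - 1)
    return result
```

Both counts are feasible here only because the digraph has 13 edges; the point of the rule is that the formula remains usable when exhaustive search is not.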


Summary 


Table 3.2 contains brief definitions of the terminology from graph theory used in this chapter. 


• Facts about Matrix Multiplication. If $A_1, \ldots, A_s$ are matrices such that $A_t$ is $n_{t-1} \times n_t$,
then the i,j-entry of the product $A_1 A_2 \cdots A_s$ is


\[ \sum_{k_1=1}^{n_1} \sum_{k_2=1}^{n_2} \cdots \sum_{k_{s-1}=1}^{n_{s-1}} A_1(i, k_1) A_2(k_1, k_2) A_3(k_2, k_3) \cdots A_s(k_{s-1}, j). \]


If $A^s = 0$ (i.e., A is nilpotent), then $I - A$ is invertible, and
\[ (I - A)^{-1} = I + A + A^2 + A^3 + \cdots + A^{s-1}. \]


This formula applies (with s = n) when A is a strictly upper or lower triangular $n \times n$
matrix.


• Adjacency Matrices and the Walk Rule. Given a graph or digraph G with vertex set
$\{v_1, \ldots, v_n\}$, the adjacency matrix of G is the matrix A such that $A(i, j)$ is the number
of edges from $v_i$ to $v_j$ in G. For all $s \geq 0$, $A^s(i, j)$ is the number of walks in G of length
s from $v_i$ to $v_j$. G is a DAG iff $A^n = 0$, in which case A will be strictly lower-triangular
under an appropriate ordering of the vertices. When G is a DAG, $(I - A)^{-1}(i, j)$ is the
total number of paths (or walks) from $v_i$ to $v_j$.


• Degree Sum Formulas. For a graph $G = (V, E, \epsilon)$, $\sum_{v \in V} \deg_G(v) = 2|E|$. For a digraph
$G = (V, E, \epsilon)$, $\sum_{v \in V} \operatorname{indeg}_G(v) = |E| = \sum_{v \in V} \operatorname{outdeg}_G(v)$.


• Functional Digraphs. For a finite set X, every function $f : X \to X$ has an associated
functional digraph with vertex set X and edge set $\{(x, f(x)) : x \in X\}$. Every functional
digraph decomposes uniquely into one or more disjoint cycles together with disjoint
rooted trees rooted at the vertices on these cycles. For each vertex $x_0$ in a functional
digraph, there exist unique walks of each length k starting at $x_0$, which are found
by repeatedly following the unique outgoing edge from the current vertex. Such walks
eventually reach a cycle in the functional digraph.


• Cycle Structure of Permutations. For a finite set X, a map $f : X \to X$ is a bijection iff
the functional digraph of f is a disjoint union of directed cycles. The signless Stirling
number of the first kind, $s'(n, k)$, counts the number of bijections f on an n-element set
such that the functional digraph of f has k cycles. We have


\[ s'(n, k) = s'(n-1, k-1) + (n-1)\, s'(n-1, k) \quad \text{for } 0 < k < n. \]


• Connectedness and Components. The vertex set of any graph or digraph G is the disjoint
union of connected components. Two vertices belong to the same component iff
each vertex is reachable from the other by a walk. G is connected iff there is only one
component iff for all $u, v \in V(G)$ there exists at least one path from u to v in G. Deleting
a cut-edge splits a component of G in two, whereas deleting a non-cut-edge has no effect
on components.


TABLE 3.2
Terminology used in graph theory.

Term | Brief Definition
graph | $(V, E, \epsilon)$ where $\epsilon(e) = \{v, w\}$ means edge e has endpoints v, w
digraph | $(V, E, \epsilon)$ where $\epsilon(e) = (v, w)$ means edge e goes from v to w
simple graph | graph with no loops or multiple edges
simple digraph | digraph with no multiple edges
$G \cong H$ | G becomes H under some renaming of vertices and edges
walk | $(v_0, e_1, v_1, \ldots, e_s, v_s)$ where each $e_i$ is an edge from $v_{i-1}$ to $v_i$
closed walk | walk starts and ends at same vertex
path | walk visiting distinct vertices
cycle | closed walk visiting distinct vertices and edges, except at end
DAG | digraph with no cycles
$\operatorname{indeg}_G(v)$ | number of edges leading to v in digraph G
$\operatorname{outdeg}_G(v)$ | number of edges leading from v in digraph G
$\deg_G(v)$ | number of edges incident to v in graph G (loops count as 2)
isolated vertex | vertex of degree zero
leaf | vertex of degree one
functional digraph | simple digraph with outdeg(v) = 1 for all vertices v
cyclic vertex | vertex in functional digraph that belongs to a cycle
rooted tree | functional digraph with a unique cyclic vertex (the root)
G is connected | for all $u, v \in V(G)$, there is a walk in G from u to v
cut-edge of G | edge belonging to no cycle of the graph G
forest | graph with no cycles
acyclic graph | graph with no cycles
tree | connected graph with no cycles
bipartite graph | all edges in graph go from $A \subseteq V(G)$ to $B \subseteq V(G)$ with $A \cap B = \emptyset$
matching of G | set M of edges in G where no two edges in M share an endpoint
m(G) | size of a maximum matching of G
vertex cover of G | set C of vertices where every edge of G has an endpoint in C
vc(G) | size of a minimum vertex cover of G
N(S) | set of vertices reachable from vertices in S by following one edge
proper coloring | map $f : V(G) \to C$ assigning unequal colors to adjacent vertices
$\chi_G(x)$ | number of proper colorings of G using x available colors
chromatic number | least x with $\chi_G(x) > 0$
subgraph of G | graph $G'$ with $V(G') \subseteq V(G)$, $E(G') \subseteq E(G)$ (same endpoints)
induced subgraph | subgraph $G'$ where all edges in G with ends in $V(G')$ are kept
spanning tree of G | subgraph of G that is a tree using all vertices
$\tau(G)$ | number of spanning trees of G
rooted spanning tree | rooted tree using all vertices of a digraph
$\tau(G, v_0)$ | number of rooted spanning trees of G with root $v_0$
Eulerian tour | walk visiting each vertex that uses every edge once


• Forests. A graph $G$ is a forest (acyclic) iff $G$ has no loops and for each $u, v \in V(G)$,
there is at most one path from $u$ to $v$. A forest with $n$ vertices and $k$ edges has $n - k$
components.


• Trees. The following conditions on an $n$-vertex simple graph $G$ are equivalent and characterize
trees: (a) $G$ is connected with no cycles; (b) $G$ is connected with at most $n - 1$
edges; (c) $G$ is acyclic with at least $n - 1$ edges; (d) for all $u, v \in V(G)$, there exists a
unique path in $G$ from $u$ to $v$. An $n$-vertex tree has $n - 1$ edges and (for $n > 1$) at least
two leaves. Pruning any leaf from a tree produces another tree with one less vertex and
one less edge.


• Tree Counting Rules. There are $n^{n-2}$ trees with vertex set $\{1, 2, \ldots, n\}$. There are $n^{n-2}$
rooted trees on this vertex set rooted at 1. For $d_1 + \cdots + d_n = 2(n - 1)$, there are
$\binom{n-2}{d_1 - 1, \ldots, d_n - 1}$ trees on this vertex set with $\deg(j) = d_j$ for all $j$. Bijective proofs of these
facts use the following ideas:


- Functions on $\{1, 2, \ldots, n\}$ fixing 1 and $n$ correspond to rooted trees by arranging
the cycles of the functional digraph in a certain order, breaking back edges, and
linking the cycles to get a tree (see Figures 3.7 and 3.8).


- Trees correspond to rooted trees by directing each edge of the tree toward the root
vertex.


- Trees with $\deg(j) = d_j$ correspond to words in $R(1^{d_1 - 1} \cdots n^{d_n - 1})$ by repeatedly
pruning the largest leaf and appending the leaf's neighbor to the end of the word.


• Matchings and Vertex Covers. For any matching $M$ and vertex cover $C$ of a graph $G$,
$|M| \le |C|$. If equality holds, then $M$ must be a maximum matching of $G$ and $C$ must
be a minimum vertex cover of $G$. We have $m(G) \le vc(G)$ for all graphs $G$, but equality
does not always hold.


• Bipartite Graphs. A graph $G$ is bipartite iff $G$ has no cycles of odd length. For bipartite
graphs $G$, $m(G) = vc(G)$. If $G$ has partite sets $A$ and $B$, there exists a matching of $G$
saturating $A$ iff for all $S \subseteq A$, $|S| \le |N(S)|$.


• Chromatic Polynomials. For any edge $e$ in a simple graph $G$, the chromatic function
of $G$ satisfies the recursion $\chi_G = \chi_{G-\{e\}} - \chi_{G_e}$, where $G-\{e\}$ is $G$ with $e$ deleted,
and $G_e$ is $G$ with $e$ collapsed. It follows that $\chi_G(x)$ is a polynomial function of $x$. The
signed Stirling numbers of the first kind, $s(n,k)$, are the coefficients in the chromatic
polynomial for an $n$-vertex graph with an edge between each pair of vertices.


• Recursion for Spanning Trees. For any edge $e$ in a graph $G$, the number $\tau(G)$ of spanning
trees of $G$ satisfies the recursion $\tau(G) = \tau(G-\{e\}) + \tau(G_e)$, where $G-\{e\}$ is $G$ with
$e$ deleted, and $G_e$ is $G$ with $e$ collapsed. A similar recursion holds for rooted spanning
trees of a digraph.


• The Matrix-Tree Theorem. Given a digraph $G$ and $v_0 \in V(G)$, let $L_{ii} = \operatorname{outdeg}_G(i)$,
let $-L_{ij}$ be the number of edges from $i$ to $j$ in $G$ (for $i \ne j$), and let $L_0$ be the matrix obtained
from $[L_{ij}]$ by erasing the row and column indexed by $v_0$. Then $\det(L_0)$ is $\tau(G, v_0)$, the
number of rooted spanning trees of $G$ with root $v_0$.
• Eulerian Tours. A digraph $G$ has a closed Eulerian tour iff $G$ is connected and balanced
(indegree equals outdegree at every vertex). In this case, the number of such tours
starting at $v_0$ is


\[ \tau(G, v_0) \cdot \operatorname{outdeg}_G(v_0)! \cdot \prod_{v \ne v_0} (\operatorname{outdeg}_G(v) - 1)!. \]


The proof associates to each tour a rooted spanning tree built from the last departure 
edge from each vertex, together with (truncated) departure words for each vertex giving 
the order in which the tour used the other outgoing edges. 
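
The Matrix-Tree Theorem and the Eulerian tour count in the summary above can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the digraph is given as a list of vertices and a list of directed edges `(u, v)`, it has no loops, and (for the tour count) it is connected and balanced. The function names `laplacian`, `det_exact`, `rooted_spanning_trees`, and `closed_eulerian_tours` are hypothetical names chosen for this example and do not come from the text.

```python
from fractions import Fraction
from math import factorial

def laplacian(vertices, edges):
    """L[i][i] = outdeg(i); L[i][j] = -(number of edges i -> j) for i != j.
    Assumption: the digraph has no loops."""
    index = {v: r for r, v in enumerate(vertices)}
    n = len(vertices)
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[index[u]][index[u]] += 1
        L[index[u]][index[v]] -= 1
    return L

def det_exact(M):
    """Determinant of an integer matrix via exact rational elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det  # a row swap flips the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return int(det)

def rooted_spanning_trees(vertices, edges, root):
    """tau(G, root): erase the root's row and column, then take det."""
    L = laplacian(vertices, edges)
    r = vertices.index(root)
    L0 = [row[:r] + row[r + 1:] for i, row in enumerate(L) if i != r]
    return det_exact(L0)

def closed_eulerian_tours(vertices, edges, root):
    """tau(G, root) * outdeg(root)! * prod over v != root of (outdeg(v) - 1)!.
    Assumption: G is connected and balanced, so every outdegree is >= 1."""
    outdeg = {v: 0 for v in vertices}
    for u, _ in edges:
        outdeg[u] += 1
    count = rooted_spanning_trees(vertices, edges, root) * factorial(outdeg[root])
    for v in vertices:
        if v != root:
            count *= factorial(outdeg[v] - 1)
    return count

# Example digraph (an illustrative choice): every ordered pair of
# distinct vertices in {0, 1, 2} is joined by one directed edge.
V = [0, 1, 2]
E = [(0, 1), (1, 2), (2, 0), (0, 2), (2, 1), (1, 0)]
print(rooted_spanning_trees(V, E, 0), closed_eulerian_tours(V, E, 0))
```

Exact rational arithmetic is used for the determinant so that the count comes out as an integer with no floating-point rounding.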


Exercises 


3-1. Draw pictures of the following simple graphs. 

(a) C = ({1, 2,3, 4}, {{1, 2}, {1,3}, {1, 4}}) (the claw graph) 

(b) P = ({1,2,3,4}, {{1, 2}, {1,3}, {1,4}, {2,3}}) (the paw graph) 
(c) K = ({1,2,3,4}, {{1,2}, {1,3}, {1,4}, {2,3}, {2,4}}) (the kite graph) 
(d) B = ({1,2,3,4,5}, {{1,2}, {2,3}, {1,3}, {1,4}, {2,5}}) (the bull graph)

(e) $K_n = (\{1, 2, \ldots, n\}, \{\{i, j\} : 1 \le i < j \le n\})$ (the complete graph on $n$ vertices)
3-2. Let V be an n-element set. (a) How many simple graphs have vertex set V? (b) How 
many simple digraphs have vertex set V? 
3-3. Let V and E be sets with |V| = n and |E| = m. (a) How many digraphs have vertex
set V and edge set E? (b) How many graphs have vertex set V and edge set E?
3-4. Let V be an n-element set. Define a bijection between the set of simple graphs with 
vertex set V and the set of symmetric, irreflexive binary relations on V. Conclude that 
simple graphs can be viewed as certain kinds of simple digraphs. 
3-5. Let $G$, $H$, and $K$ be graphs. (a) Prove $G \cong G$. (b) Prove $G \cong H$ implies $H \cong G$.
(c) Prove $G \cong H$ and $H \cong K$ imply $G \cong K$. Thus, graph isomorphism is an equivalence
relation on any given set of graphs.
3-6. Find all isomorphism classes of simple graphs with at most four vertices. 
3-7. Find the adjacency matrices for the graphs in Exercise 3-1. 


3-8. Let G be the simple graph in Figure 3.10. For 1 < k < 8, find the number of walks of
length k in G from vertex 1 to vertex 10.

3-9. Let G be the graph in Figure 3.21. Find the number of walks in G of length 5 between 
each pair of vertices. 

3-10. Let G be the digraph in Figure 3.24. Find the number of closed walks in G of length 
10 that begin at vertex 0. 

3-11. Let $G$ be a graph with adjacency matrix $A$. (a) Find a formula for the number of
paths in $G$ of length 2 from $v_i$ to $v_j$. (b) Find a formula for the number of paths in $G$ of
length 3 from $v_i$ to $v_j$.

3-12. Consider the DAG G shown here. 
(a) Find all total orderings of the vertices for which the adjacency matrix of G is strictly
lower-triangular. (b) How many paths in G go from vertex 5 to vertex 1?


3-13. A strict partial order on a set X is an irreflexive, transitive binary relation on X. 
Given a strict partial order R on a finite set X, show that the simple digraph (X, R) is a 
DAG. 


3-14. For each of the following sets X and strict partial orders R, draw the associated DAG 
and calculate the number of paths from the smallest element to the largest element of the 
partially ordered set. 

(a) X = {1,2,3,4,5} under the ordering 1<2<3<4<5. 

(b) $X$ is the set of subsets of $\{1, 2, 3\}$, and $(S, T) \in R$ iff $S \subsetneq T$.

(c) $X$ is the set of positive divisors of 60, and $(a, b) \in R$ iff $a < b$ and $a$ divides $b$.
3-15. Let $X = \{1, 2, \ldots, n\}$ ordered by $1 < 2 < \cdots < n$. In the associated DAG, how many
paths go from 1 to $n$? Can you find a combinatorial (not algebraic) proof of your answer?


3-16. Let $X$ be the set of subsets of $\{1, 2, \ldots, n\}$ ordered by strict set inclusion. In the
associated DAG, how many paths go from $\emptyset$ to $\{1, 2, \ldots, n\}$?


3-17. Given a digraph $G$, construct a simple digraph $H$ as follows. The vertices of $H$ are
the strong components of $G$. Given $C, D \in V(H)$ with $C \ne D$, there is an edge from $C$ to $D$
in $H$ iff there exist $c \in C$ and $d \in D$ such that there is an edge from $c$ to $d$ in $G$. (a) Prove
that $H$ is a DAG. (b) Conclude that some strong component $C$ of $G$ has no incoming edges
from outside $C$, and some strong component $D$ has no outgoing edges. (c) Draw the DAGs
associated to the digraph $G_3$ in Figure 3.1 and the functional digraph in Figure 3.5.


3-18. (a) Find the degree multiset for the graph in Figure 3.10, and verify Theorem 3.35 
in this case. (b) Compute the indegrees and outdegrees at each vertex of the digraph in 
Figure 3.24, and verify Theorem 3.32 in this case. 


3-19. Find necessary and sufficient conditions for a multiset [d1,d2,...,d,] to be the degree 
multiset of a graph G. 


3-20. Consider the cycle graph $C_n$ defined in Example 3.84. (a) What is $\deg(C_n)$? (b) Show
that any connected graph with the degree multiset in (a) must be isomorphic to $C_n$. (c) How
many graphs with vertex set $\{1, 2, \ldots, n\}$ are isomorphic to $C_n$? (d) How many isomorphism
classes of graphs have the same degree multiset as $C_n$? (e) How many isomorphism classes
of simple graphs have the same degree multiset as $C_n$?

3-21. Consider the path graph $P_n$ defined in Example 3.84. (a) What is $\deg(P_n)$? (b) Show
that any connected graph with the degree multiset in (a) must be isomorphic to $P_n$. (c) How
many graphs with vertex set $\{1, 2, \ldots, n\}$ are isomorphic to $P_n$? (d) How many isomorphism
classes of graphs have the same degree multiset as $P_n$?

3-22. Find two simple graphs $G$ and $H$ with the smallest possible number of vertices, such
that $\deg(G) = \deg(H)$ but $G \not\cong H$.

3-23. Prove or disprove: there exists a simple graph G with more than one vertex such that 
the degree multiset deg(G) contains no repetitions. 
3-24. Prove or disprove: there exists a graph G with no loops and more than one vertex
such that the degree multiset deg(G) contains no repetitions.


3-25. Given a graph $G = (V, E, \epsilon)$, we can encode the endpoint function $\epsilon$ by a $|V| \times |E|$
matrix $M$, with rows indexed by $V$ and columns indexed by $E$, such that $M(v, e)$ is 2 if $e$
is a loop edge at $v$, 1 if $e$ is a non-loop edge incident to $v$, and 0 otherwise. $M$ is called the
incidence matrix of $G$. Prove the Degree Sum Formula 3.35 by computing the sum of all
entries of $M$ in two ways.


3-26. Draw the functional digraphs associated to each of the following functions $f : X \to X$.
For each digraph, find the set $C$ of cyclic vertices and the set partition $\{S_v : v \in C\}$
described in Theorem 3.43. (a) $X = \{1, 2, 3, 4\}$, $f$ is the identity map on $X$; (b) $X =
\{0, 1, \ldots, 6\}$, $f(x) = (x^2 + 1) \bmod 7$; (c) $X = \{0, 1, \ldots, 12\}$, $f(x) = (x^2 + 1) \bmod 13$; (d) $X =
\{0, 1, \ldots, 10\}$, $f(x) = 3x \bmod 11$; (e) $X = \{0, 1, \ldots, 11\}$, $f(x) = 4x \bmod 12$.

3-27. Let $X = \{0, 1, 2, \ldots, 9\}$. (a) Define $f : X \to X$ by setting $f(x) = (3x + 7) \bmod 10$.
Draw the functional digraphs for $f$, $f^{-1}$, and $f \circ f$. What is the smallest integer $k > 0$ such
that $f \circ f \circ \cdots \circ f$ ($k$ factors) is the identity map on $X$? (b) Define $g : X \to X$ by setting
$g(x) = (2x + 3) \bmod 10$. Draw the functional digraphs for $g$ and $g \circ g$.

3-28. Let $X$ be a finite set, let $x_0 \in X$, and let $f : X \to X$ be any function. Recursively
define $x_{m+1} = f(x_m)$ for all $m \ge 0$. Show that there exists $i > 0$ with $x_i = x_{2i}$.
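
To get a feel for the statement of Exercise 3-28 before proving it, one can search for such an index by iterating the function at two speeds. The Python sketch below is an illustration only (it is not a proof), and the name `first_meeting_index` and the sample function are assumptions made for this example.

```python
def first_meeting_index(f, x0):
    """Return the least i > 0 with x_i == x_{2i}, where x_{m+1} = f(x_m).

    Assumes f maps a finite set into itself, so the loop must stop.
    """
    slow = f(x0)       # holds x_i, starting with i = 1
    fast = f(f(x0))    # holds x_{2i}
    i = 1
    while slow != fast:
        slow = f(slow)        # x_i    -> x_{i+1}
        fast = f(f(fast))     # x_{2i} -> x_{2i+2}
        i += 1
    return i

# Illustrative choice of function: f(x) = (x^2 + 1) mod 11 on {0, ..., 10}.
sample = lambda x: (x * x + 1) % 11
print(first_meeting_index(sample, 0))
```

The search uses constant memory, which is the reason the same two-speed idea is convenient in the factoring algorithm of the next exercise.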


3-29. Pollard-rho Factoring Algorithm. Suppose $N > 1$ is an integer. Let $X =
\{0, 1, \ldots, N - 1\}$, and define $f : X \to X$ by $f(x) = (x^2 + 1) \bmod N$. (a) Show that the
following algorithm always terminates and returns a divisor of $N$ greater than 1. (Use the
previous exercise.)


Step 1. Set u = f(0), v = f(f(0)), and d = gcd(v - u, N).
Step 2. While d = 1: set u = f(u), v = f(f(v)), and d = gcd(v - u, N).
Step 3. Return d.


(b) Trace the steps taken by this algorithm to factor N = 77 and N = 527. 
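
The three steps of Exercise 3-29 translate almost line by line into code. The following Python sketch assumes $f(x) = (x^2 + 1) \bmod N$ as in the exercise; the function name `pollard_rho` and the optional `trace` argument are hypothetical conveniences added for this example, intended for checking a hand trace.

```python
from math import gcd

def pollard_rho(N, trace=False):
    """Follow Steps 1-3 of the exercise for an integer N > 1.

    Returns a divisor d > 1 of N. Note that d may equal N itself,
    in which case no proper factor has been found.
    """
    def f(x):
        return (x * x + 1) % N

    # Step 1
    u, v = f(0), f(f(0))
    d = gcd(v - u, N)
    if trace:
        print(u, v, d)
    # Step 2
    while d == 1:
        u, v = f(u), f(f(v))
        d = gcd(v - u, N)
        if trace:
            print(u, v, d)
    # Step 3
    return d

# Usage: print each (u, v, d) triple while factoring a small number.
pollard_rho(91, trace=True)
```

Python's `math.gcd` accepts negative arguments and returns a nonnegative result, so `v - u` needs no absolute value.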


3-30. Suppose $X$ is a finite set of size $k$ and $f : X \to X$ is a random function (which means
that for all $x, y \in X$, $P(f(x) = y) = 1/k$, and these events are independent for different
choices of $x$). Let $x_0 \in X$, define $x_{m+1} = f(x_m)$ for all $m \ge 0$, and let $S$ be the least
index such that $x_S = x_t$ for some $t < S$. (a) For each $s \ge 0$, find the exact probability that
$S > s$. (b) Argue informally that the expected value of $S$ is at most $2\sqrt{k}$. (c) Use (b) to
argue informally that the expected number of gcd computations needed by the Pollard-rho
factoring algorithm to find a divisor of a composite number $N$ is bounded above by $2N^{1/4}$.


3-31. Let $V$ be an $n$-element set, and let $v_0 \notin V$. A function $f : V \to V$ is called acyclic iff
all cycles in the functional digraph of $f$ have length 1. Count these functions by setting up
a bijection between the set of acyclic functions on $V$ and the set of rooted trees on $V \cup \{v_0\}$
with root $v_0$.

3-32. How many bijections f on an 8-element set are such that the functional digraph of f 
has: (a) five cycles; (b) three cycles; (c) one cycle? 
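
Counts of this kind can be computed from the recursion $s'(n,k) = s'(n-1,k-1) + (n-1)s'(n-1,k)$ stated in the chapter summary. The sketch below is one possible implementation; the function name `signless_stirling1` is an assumption, and the boundary values ($s'(n,n) = 1$, and $s'(n,k) = 0$ when $k \le 0 < n$ or $k > n$) are the standard ones, supplied here because the summary states the recursion only for $0 < k < n$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def signless_stirling1(n, k):
    """Number of bijections on an n-element set whose functional
    digraph has exactly k cycles."""
    if n == k:
        return 1              # only the identity map has n cycles
    if k <= 0 or k > n:
        return 0              # no bijection has this many cycles
    return (signless_stirling1(n - 1, k - 1)
            + (n - 1) * signless_stirling1(n - 1, k))

# Usage: one row of the table, for a 4-element set.
print([signless_stirling1(4, k) for k in range(1, 5)])
```

A quick sanity check is that each row sums to $n!$, since every bijection has some number of cycles between 1 and $n$.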

3-33. Let $X$ be an $n$-element set. Let $Y$ be the set of all functional digraphs for bijections
$f : X \to X$. How many equivalence classes does $Y$ have under the equivalence relation of
graph isomorphism?

3-34. How many functional digraphs with vertex set $\{1, 2, \ldots, n\}$ have $a_1$ cycles of length
1, $a_2$ cycles of length 2, etc., where $\sum_i i a_i = n$?

3-35. Referring to the proof of the Rooted Tree Rule, draw pictures of the set $A$ of functions,
the set $B$ of trees, and the bijection $\phi : A \to B$ when $n = 4$.
3-36. Compute the rooted tree associated to the function below by the map $\phi$ in the proof
of the Rooted Tree Rule.


3-37. Compute the function associated to the rooted tree with edge set 


{(1, 1), (2, 12), (3, 1), (4, 3), (5, 10), (6, 17), (7, 15), (8, 7), (9, 3),
(10, 3), (11, 12), (12, 1), (13, 4), (14, 10), (15, 1), (16, 4), (17, 4)}


by the map $\phi^{-1}$ in the proof of the Rooted Tree Rule.


3-38. Formulate a theorem for rooted trees similar to Theorem 3.74, and prove it by ana- 
lyzing the bijection in the Rooted Tree Rule. 


3-39. Let G be the digraph in Figure 3.2. Use the algorithm in Theorem 3.52 to convert 
the walk 


W= (1,6,1,6,1, 4,3, f,5, m, 2, n,5,h,4,c,3, f,5, 7,4, 9,5, m, 2, k, 4) 


to a path in G from 1 to 4. 
3-40. What are the strong components of a functional digraph? 


3-41. Show that a connected graph G with n vertices has n edges iff G has exactly one 
cycle. 


3-42. Prove that a graph G is not connected iff there exists an ordering of the vertices of 
G for which the adjacency matrix of G is block-diagonal with at least two diagonal blocks. 


3-43. Prove Theorem 3.60. 

3-44. How many connected simple graphs have vertex set {1,2,3,4}? 

3-45. Prove that every forest is bipartite. 

3-46. How many connected simple graphs on the vertex set {1,2,3,4,5} have exactly five 
edges? 

3-47. Prove that a graph G with no odd-length cycles is bipartite by induction on the 
number of edges in G. 

3-48. How many bipartite simple graphs have partite sets A = {1,2,...,m} and B = 
{m+1,...,m+n}? 

3-49. Suppose G is a bipartite graph with c components. Count the number of decomposi- 
tions of V(G) into an ordered pair of partite sets (A, B). 


3-50. Suppose G is a k-regular graph with n vertices. (a) How many edges are in G? (b) If 
k > 0 and G is bipartite with partite sets A and B, prove that |A| = |B|. 


3-51. Fix k > 2. Prove or disprove: there exists a k-regular bipartite graph G such that G 
has a cut-edge. 


3-52. Suppose G is a graph, and G' is obtained from G by deleting some edges of G. What
is the relation between m(G) and m(G')? What is the relation between vc(G) and vc(G')?


3-53. For each k > 0, give a specific example of a graph G such that vc(G) — m(G) = k. 


3-54. Find a maximum matching and a minimum vertex cover for the graph shown here. 
3-55. For $m, n \ge 1$, the grid graph $G_{m,n}$ has vertex set $\{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\}$ with
edges from $(i, j)$ to $(i+1, j)$ for $1 \le i < m$, $1 \le j \le n$, and edges from $(i, j)$ to $(i, j+1)$ for
$1 \le i \le m$, $1 \le j < n$. For example, Figure 1.2 displays several copies of the graph $G_{3,4}$.
Find a maximum matching and a minimum vertex cover for each graph $G_{m,n}$, and hence
determine $m(G_{m,n})$ and $vc(G_{m,n})$.

3-56. Suppose $X$ is a finite set, and $P = \{S_1, \ldots, S_k\}$ is a collection of subsets of $X$.
A system of distinct representatives for $P$ is a list $a_1, \ldots, a_k$ of $k$ distinct elements of $X$
such that $a_i \in S_i$ for all $i$. Prove that $P$ has a system of distinct representatives iff for all
$I \subseteq \{1, 2, \ldots, k\}$, $|I| \le \bigl|\bigcup_{i \in I} S_i\bigr|$.

3-57. An independent set of a graph $G$ is a subset $I$ of the vertex set of $G$ such that no
two vertices in $I$ are adjacent. Let $i(G)$ be the size of a maximum independent set of $G$.
An edge cover of $G$ is a subset $C$ of the edge set of $G$ such that every vertex of $G$ is the
endpoint of some edge in $C$. Let $ec(G)$ be the size of a minimum edge cover of $G$. Prove an
analogue of the Lemma on Matchings and Vertex Covers for the numbers $i(G)$ and $ec(G)$.
3-58. Show that $I$ is an independent set of a graph $G$ iff $V(G) - I$ is a vertex cover of $G$.
Conclude that $i(G) + vc(G) = |V(G)|$.

3-59. Let $G$ be a graph with no isolated vertex. (a) Given any maximum matching $M$ of
$G$, use $M$ to construct an edge cover of $G$ of size $|V(G)| - |M|$. (b) Given any minimum
edge cover $C$ of $G$, use $C$ to construct a matching of $G$ of size $|V(G)| - |C|$. (c) Conclude
that $m(G) + ec(G) = |V(G)|$.

3-60. Use the preceding exercises to prove that for a bipartite graph G with no vertex of 
degree zero, i(G) = ec(G). 

3-61. Use Hall's Matching Theorem to prove the König-Egerváry Theorem.

3-62. Prove that an n-vertex graph G in which every vertex has degree at least (n — 1)/2 
must be connected. 

3-63. Let $G$ be a forest with $n$ vertices and $k$ connected components. Compute
$\sum_{v \in V(G)} \deg_G(v)$ in terms of $n$ and $k$.


3-64. The arboricity of a simple graph $G$, denoted $\operatorname{arb}(G)$, is the least $n$ such that there
exist $n$ forests $F_i$ with $V(G) = \bigcup_{i=1}^{n} V(F_i)$ and $E(G) = \bigcup_{i=1}^{n} E(F_i)$. Prove that


\[ \operatorname{arb}(G) \ge \max_{H} \left\lceil \frac{|E(H)|}{|V(H)| - 1} \right\rceil, \]


where $H$ ranges over all induced subgraphs of $G$ with more than one vertex. (It can be
shown that equality holds [94].)


3-65. Show that any tree not isomorphic to a path graph $P_n$ must have at least three leaves.


3-66. Let $T$ be a tree. Show that $\deg_T(v)$ is odd for all $v \in V(T)$ iff for all $e \in E(T)$, both
connected components of $(V(T), E(T) - \{e\})$ have an odd number of vertices.


3-67. Helly Property of Trees. Suppose $T, T_1, \ldots, T_k$ are trees, each $T_i$ is a subgraph
of $T$, and $V(T_i) \cap V(T_j) \ne \emptyset$ for all $i, j \le k$. Show that $\bigcap_{i=1}^{k} V(T_i) \ne \emptyset$.
3-68. Let $G$ be a tree with leaves $\{v_1, \ldots, v_m\}$. Let $H$ be a tree with leaves $\{w_1, \ldots, w_m\}$.
Suppose that, for each $i$ and $j$, the length of the unique path in $G$ from $v_i$ to $v_j$ equals the
length of the unique path in $H$ from $w_i$ to $w_j$. Prove $G \cong H$.

3-69. For 1 <n < 7, count the number of isomorphism classes of trees with n vertices. 
3-70. (a) How many isomorphism classes of n-vertex trees have exactly three leaves? 
(b) How many trees with vertex set {1,2,...,n} have exactly three leaves?

3-71. How many trees with vertex set {1,2,...,n} have exactly k leaves? 

3-72. Let $K_n$ be the complete graph on $n$ vertices (see Exercise 3-1). (a) Give a bijective or
probabilistic proof that every edge of $K_n$ appears in the same number of spanning trees of
$K_n$. (b) Use Cayley's Theorem to count the spanning trees of $K_n$ that do not use the edge
$\{1, 2\}$.

3-73. Use Theorem 3.74 to find the number of trees T with V(T) = {1,2,...,8} and 
deg(T) = [3,3,3,1,1,1,1, 1]. 

3-74. Let $t_n$ be the number of trees on a given $n$-element vertex set. Without using Cayley's
Theorem, prove the recursion


3-75. (a) Use the pruning bijection to find the word associated to the tree 


T = ({0,1,...,8}, {{1, 5}, {2, 8}, {3, 7}, {7, 0}, {6, 2}, {4, 7}, {5, 4}, {2, 4}}).
(b) Use the inverse of the pruning bijection to find the tree with vertex set {0,1,...,8} 
associated to the word 1355173. 
3-76. Use the inverse of the pruning bijection to find all trees with vertex set {1,2,...,7} 
associated to the words in R(11334). 
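
The pruning bijection used in the last two exercises is described in the chapter summary as "repeatedly pruning the largest leaf and appending the leaf's neighbor to the end of the word." The Python sketch below follows that description. It assumes that pruning stops once two vertices remain, which is inferred from the fact that the words in $R(1^{d_1-1} \cdots n^{d_n-1})$ have length $n - 2$; the function name `pruning_word` and the sample tree are illustrative choices.

```python
def pruning_word(vertices, edges):
    """Word obtained by repeatedly pruning the largest leaf of a tree.

    vertices: iterable of comparable labels; edges: iterable of pairs.
    Assumption: pruning stops when exactly two vertices remain.
    """
    nbrs = {v: set() for v in vertices}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    word = []
    while len(nbrs) > 2:
        leaf = max(v for v in nbrs if len(nbrs[v]) == 1)
        (neighbor,) = nbrs[leaf]      # a leaf has exactly one neighbor
        word.append(neighbor)
        nbrs[neighbor].discard(leaf)  # remove the leaf and its edge
        del nbrs[leaf]
    return word

# Usage on a small tree with vertex set {1, ..., 5}.
print(pruning_word([1, 2, 3, 4, 5], [(1, 3), (2, 3), (3, 4), (4, 5)]))
```

In the resulting word each vertex $j$ appears $\deg(j) - 1$ times, which matches the description of the target set of words.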
3-77. Let $G$ be the graph with vertex set $\{\pm 1, \pm 2, \ldots, \pm n\}$ and with an edge between $i$
and $-j$ for all $i, j \in \{1, 2, \ldots, n\}$. (a) Show that any spanning tree in $G$ has at least one
positive leaf and at least one negative leaf. (b) Develop an analogue of the pruning map
that sets up a bijection between the set of spanning trees of $G$ and pairs of words $(u, v)$,
where $u \in \{1, \ldots, n\}^{n-1}$ and $v \in \{-1, \ldots, -n\}^{n-1}$. Conclude that $G$ has $n^{2n-2}$ spanning
trees.


3-78. Let $\chi_n(x)$ be the chromatic polynomial for the graph $C_n$ consisting of $n$ vertices
joined in a cycle. Prove that


\[ \chi_n(x) = (x - 1)^n + (-1)^n (x - 1) \quad \text{for all } n \ge 2. \]
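
Before attempting a proof, the formula for $\chi_n(x)$ can be compared against a brute-force count of proper colorings for small $n$ and $x$. The Python sketch below does this; it is a numerical sanity check rather than a proof, and the helper names `count_proper_colorings`, `cycle_edges`, and `cycle_formula` are assumptions made for this example.

```python
from itertools import product

def count_proper_colorings(n_vertices, edges, x):
    """Count maps {0..n-1} -> {0..x-1} giving adjacent vertices
    unequal colors, by trying every possible coloring."""
    return sum(
        all(coloring[a] != coloring[b] for a, b in edges)
        for coloring in product(range(x), repeat=n_vertices)
    )

def cycle_edges(n):
    """Edges of the cycle on vertices 0, 1, ..., n-1 (for n = 2 this
    gives two parallel edges between 0 and 1)."""
    return [(i, (i + 1) % n) for i in range(n)]

def cycle_formula(n, x):
    return (x - 1) ** n + (-1) ** n * (x - 1)

# Compare both counts for a few small cases.
for n in range(2, 6):
    for x in range(1, 5):
        print(n, x, count_proper_colorings(n, cycle_edges(n), x), cycle_formula(n, x))
```

The brute-force count takes $x^n$ steps, so it is only practical for very small cases.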


3-79. Find the chromatic polynomials for the graphs in Exercise 3-1. 
3-80. Find the chromatic polynomial and chromatic number for the graph G2 in Figure 3.1. 
3-81. Find two non-isomorphic simple graphs with the same chromatic polynomial. 


3-82. A certain department needs to schedule meetings for a number of committees, whose 
members are listed in the following table. 


Committee        Members
Advisory         Driscoll, Loomis, Lasker
Alumni           Sheffield, Loomis
Colloquium       Johnston, Tchaikovsky, Zorn
Computer         Loomis, Clark, Spade
Graduate         Kennedy, Loomis, Trotter
Merit            Lee, Rotman, Fowler, Sheffield
Personnel        Lasker, Schreier, Tchaikovsky, Trotter
Undergraduate    Jensen, Lasker, Schreier, Trotter, Perkins

(a) What is the minimum number of time slots needed so that all committees can meet
with no time conflicts? (b) How many non-conflicting schedules are possible if there are six 
(distinguishable) time slots available? (c) Repeat (a) and (b), assuming that Zorn becomes 
a member of the Merit Committee (and remains a member of the Colloquium Committee). 


3-83. Let $K_n$ be the complete graph on $n$ vertices (see Exercise 3-1). (a) How many subgraphs
does $K_n$ have? (b) How many induced subgraphs does $K_n$ have?


3-84. Prove that a graph G has at least one spanning tree iff G is connected. 
3-85. Fill in the details of the proof of Theorem 3.110. 
3-86. Use the Spanning Tree Recursion 3.107 to find 7(G) for the graph G, in Figure 3.1. 


3-87. Let $T_1$ and $T_2$ be spanning trees of a graph $G$. (a) If $e_1 \in E(T_1) - E(T_2)$, prove there
exists $e_2 \in E(T_2) - E(T_1)$ such that


\[ T_3 = (V(G), (E(T_1) - \{e_1\}) \cup \{e_2\}) \]


is a spanning tree of $G$. (b) If $e_1 \in E(T_1) - E(T_2)$, prove there exists $e_2 \in E(T_2) - E(T_1)$
such that

\[ T_3 = (V(G), (E(T_2) \cup \{e_1\}) - \{e_2\}) \]
is a spanning tree of $G$.


3-88. Fix $k \ge 3$. For each $n \ge 1$, let $G_n$ be a graph obtained by gluing together $n$ regular
$k$-sided polygons in a row along shared edges. The figure below illustrates the case k = 6,
n=6d, 


Let $G_0$ consist of a single edge. Prove the recursion
\[ \tau(G_n) = k\,\tau(G_{n-1}) - \tau(G_{n-2}) \quad \text{for all } n \ge 2. \]


What are the initial conditions? 

3-89. Find m(G) and vc(G) for the graph G displayed in the previous exercise. 

3-90. Given a simple graph $G$, let $G - v$ be the induced subgraph with vertex set $V(G) - \{v\}$.
Assume $|V(G)| = n \ge 3$. (a) Prove that $|E(G)| = (n - 2)^{-1} \sum_{v \in V(G)} |E(G - v)|$. (b) Prove
that, for $v_0 \in V(G)$, $\deg_G(v_0) = (n - 2)^{-1} \sum_{v \in V(G)} |E(G - v)| - |E(G - v_0)|$.

3-91. For each graph in Exercise 3-1, count the number of spanning trees by direct enu- 
meration, and again by the matrix-tree theorem. 


3-92. Confirm by direct enumeration that the digraph in Figure 3.24 has 16 spanning trees 
rooted at 0. 


3-93. Let G be the graph with vertex set {0,1}° such that there is an edge between v, w € 
V(G) iff the words v and w differ in exactly one position. Find the number of spanning 
trees of G. 


3-94. Let $I$ be the $m \times m$ identity matrix, let $J$ be the $m \times m$ matrix all of whose entries
are 1, and let $t, u$ be scalars. Show that $\det(tI - uJ) = t^m - m t^{m-1} u$.


3-95. Deduce Cayley’s Theorem 3.71 from the Matrix-Tree Theorem 3.113. 
3-96. Let $A$ and $B$ be disjoint sets of size $m$ and $n$, respectively. Let $G$ be the simple graph
with vertex set $A \cup B$ and edge set $\{\{a, b\} : a \in A, b \in B\}$. Show that $\tau(G) = m^{n-1} n^{m-1}$.


3-97. How many closed Eulerian tours starting at vertex 5 does the digraph in Figure 3.25 
have? 


3-98. Find necessary and sufficient conditions for a graph to have a (not necessarily closed) 
Eulerian tour. 


3-99. Consider a digraph with indistinguishable edges consisting of a vertex set V and a 
multiset of directed edges (u,v) € V x V. Formulate the notion of a closed Eulerian tour 
for such a digraph, and prove an analogue of Theorem 3.121. 


3-100. de Bruijn Sequences. Let $A = \{x_1, \ldots, x_n\}$ be an $n$-letter alphabet. For each
$k \ge 2$, show that there exists a word $w = w_0 w_1 \cdots w_{n^k - 1}$ such that the $n^k$ words


\[ w_i w_{i+1} \cdots w_{i+k-1} \quad (\text{where } 0 \le i < n^k \text{ and subscripts are reduced mod } n^k) \]


consist of all possible $k$-letter words over $A$.


3-101. The Petersen graph is the graph $G$ with vertex set consisting of all two-element
subsets of $\{1, 2, 3, 4, 5\}$, and with edge set $\{\{A, B\} : A \cap B = \emptyset\}$. (a) Compute the number
of vertices and edges in $G$. (b) Show that $G$ is isomorphic to each of the graphs shown here.


(c) Show that G is 3-regular. (d) Is G bipartite? (e) Show that any two non-adjacent vertices 
in G have exactly one common neighbor. 


3-102. Find (with proof) all k such that the Petersen graph has a cycle of length k. 


3-103. Given any edge e in the Petersen graph G, count the number of cycles of length 5 
in G that contain e. Use this to count the total number of cycles of length 5 in G. 


3-104. (a) Prove that the Petersen graph G has exactly ten cycles of length 6. (b) How 
many claws (see Exercise 3-1) appear as induced subgraphs of G? 


3-105. How many spanning trees does the Petersen graph have? 




Notes 


Our coverage of graph theory in this chapter has been limited to a few enumerative topics. 
Systematic expositions of graph theory may be found in [13, 16, 17, 25, 52, 60, 130, 136]; the 
text by West is especially recommended. Roberts [109] gives a treatment of graph theory 
that emphasizes applications. 
The bijection used to enumerate rooted trees in the Rooted Tree Rule 3.47 is due to
Eğecioğlu and Remmel [28]. The original proof of Cayley's Theorem appears in [21]. The
pruning bijection described in §3.12 is due to Prüfer [100]; the image of a tree under this
map is often called the Prüfer code of the tree. For more on the enumeration of trees,
see [91].

Our discussion of the König-Egerváry Theorem is based on Rizzi's proof [108]. The
matrix-tree theorem for undirected graphs is often attributed to Kirchhoff [71]; Tutte
extended the theorem to digraphs [126]. The enumeration of Eulerian tours in the Eulerian
Tour Rule 3.121 was proved by van Aardenne-Ehrenfest and de Bruijn [127].


4


Inclusion-Exclusion, Involutions, and Möbius Inversion


This chapter studies combinatorial techniques that are related to the arithmetic operation
of subtraction: inclusion-exclusion formulas, involutions, and Möbius inversion. The
Inclusion-Exclusion Formula extends the Disjoint Union Rule to a rule for computing
$|S_1 \cup S_2 \cup \cdots \cup S_n|$ in the case where the sets $S_i$ need not be pairwise disjoint. Involutions
allow us to give bijective proofs of identities involving both positive and negative
terms, including the Inclusion-Exclusion Formula. The chapter concludes by discussing a
generalization of inclusion-exclusion called the Möbius Inversion Formula for Posets, which
has many applications in number theory and algebra as well as combinatorics.


4.1 The Inclusion-Exclusion Formula 


Recall the Disjoint Union Rule: if $S_1, S_2, \ldots, S_n$ are pairwise disjoint finite sets, then
\[ |S_1 \cup S_2 \cup \cdots \cup S_n| = |S_1| + |S_2| + \cdots + |S_n|. \]


Can we find a formula for $|S_1 \cup \cdots \cup S_n|$ in the case where the given sets $S_i$ are not necessarily
disjoint? The answer is provided by the Inclusion-Exclusion Formula, which we discuss now.

We have already seen the smallest case of the Inclusion-Exclusion Formula. Specifically, 
if S and T are any two finite sets, the Union Rule for Two Sets states that 


\[ |S \cup T| = |S| + |T| - |S \cap T|. \]


Intuitively, the sum $|S| + |T|$ overestimates the cardinality of $S \cup T$ because elements of
$S \cap T$ are included twice in this sum. To correct this, we exclude one copy of each of the
elements in $S \cap T$ by subtracting $|S \cap T|$.

Now consider three finite sets $S$, $T$, and $U$. The sum $|S| + |T| + |U|$ overcounts the
size of $S \cup T \cup U$ since elements in the overlaps between these sets are counted twice (or
three times, in the case of elements $z \in S \cap T \cap U$). We may try to account for this by
subtracting $|S \cap T| + |S \cap U| + |T \cap U|$ from $|S| + |T| + |U|$. If $x$ belongs to $S$ and $U$ but
not $T$ (say), this subtraction will cause $x$ to be counted only once in the overall expression.
A similar comment applies to elements in $(S \cap T) - U$ and $(T \cap U) - S$. However, an element
$z \in S \cap T \cap U$ is counted three times in $|S| + |T| + |U|$ and subtracted three times in
$|S \cap T| + |S \cap U| + |T \cap U|$. So we must include such elements once again by adding the term
$|S \cap T \cap U|$. In summary, we have given an informal argument suggesting that the formula


\[ |S \cup T \cup U| = |S| + |T| + |U| - |S \cap T| - |S \cap U| - |T \cap U| + |S \cap T \cap U| \]


should be true. 
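
As a small numerical check of the three-set formula, here is a worked instance. The particular sets are an illustrative choice and do not come from the text: take $X = \{1, 2, \ldots, 30\}$ and let $S$, $T$, $U$ be the multiples of 2, 3, and 5 in $X$, respectively.

```latex
% Illustrative example (sets chosen for this sketch, not taken from the text):
% S, T, U = multiples of 2, 3, 5 in {1, ..., 30}.
\begin{align*}
|S| &= 15, \quad |T| = 10, \quad |U| = 6, \\
|S \cap T| &= 5, \quad |S \cap U| = 3, \quad |T \cap U| = 2, \quad |S \cap T \cap U| = 1, \\
|S \cup T \cup U| &= 15 + 10 + 6 - 5 - 3 - 2 + 1 = 22.
\end{align*}
```

This agrees with a direct count: the only elements of $X$ divisible by none of 2, 3, 5 are 1, 7, 11, 13, 17, 19, 23, 29, and $30 - 8 = 22$.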
Generalizing the pattern in the preceding example, we arrive at the following formula. 
4.1. The Inclusion-Exclusion Formula (General Union Rule). Suppose $n > 0$ and
$S_1, \ldots, S_n$ are any finite sets. Then


\[ |S_1 \cup S_2 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \tag{4.1} \]


4.2. Example. If $n = 4$, the Inclusion-Exclusion Formula says that $|S_1 \cup S_2 \cup S_3 \cup S_4|$
equals
\begin{align*}
& |S_1| + |S_2| + |S_3| + |S_4| \\
& - |S_1 \cap S_2| - |S_1 \cap S_3| - |S_1 \cap S_4| - |S_2 \cap S_3| - |S_2 \cap S_4| - |S_3 \cap S_4| \\
& + |S_1 \cap S_2 \cap S_3| + |S_1 \cap S_2 \cap S_4| + |S_1 \cap S_3 \cap S_4| + |S_2 \cap S_3 \cap S_4| \\
& - |S_1 \cap S_2 \cap S_3 \cap S_4|.
\end{align*}
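
Formula (4.1) can be evaluated mechanically by looping over all $k$-element choices of sets. The Python sketch below does this and compares the result with the size of the union computed directly; the function name `union_size_by_inclusion_exclusion` and the sample sets are assumptions made for this example.

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """Evaluate the right side of formula (4.1) for a list of finite sets."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        # One term for each choice of indices i_1 < i_2 < ... < i_k.
        for chosen in combinations(sets, k):
            total += sign * len(set.intersection(*chosen))
    return total

# Four overlapping sets, an arbitrary illustrative choice.
S = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {1, 5, 6}]
print(union_size_by_inclusion_exclusion(S), len(set().union(*S)))
```

The sum has $2^n - 1$ terms, so this is a way to check the formula on small inputs, not an efficient way to compute a union.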


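4.3. Remark. By setting $I = \{i_1, i_2, \ldots, i_k\}$, the Inclusion-Exclusion Formula can also be
written


\[ |S_1 \cup \cdots \cup S_n| = \sum_{\emptyset \ne I \subseteq \{1, 2, \ldots, n\}} (-1)^{|I| - 1} \Bigl| \bigcap_{i \in I} S_i \Bigr|. \]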


We now give a proof of the Inclusion-Exclusion Formula using induction. This proof may 
be omitted without loss of continuity. A more combinatorial proof of the formula will be 
given later (§4.7) after we discuss involutions. 


4.4. Proof of Inclusion-Exclusion by Induction. We prove that (4.1) holds for all
$n > 0$ and all finite sets $S_1, \ldots, S_n$ by induction on $n$. The formula reduces to $|S_1| = |S_1|$
for $n = 1$, which is true. For $n = 2$, the formula becomes


\[ |S_1 \cup S_2| = |S_1| + |S_2| - |S_1 \cap S_2|, \]


and this is the Union Rule for Two Sets proved previously (see 1.41). Now assume $n > 2$ and
that formula (4.1) is already known to hold for any union of $n - 1$ finite sets. Let $S_1, \ldots, S_n$
be fixed finite sets. The union of $n$ sets $S_1 \cup \cdots \cup S_n$ can be regarded as the union of the
two sets $S = S_1 \cup S_2 \cup \cdots \cup S_{n-1}$ and $T = S_n$. Hence, by the Union Rule for Two Sets,


\[ |S_1 \cup \cdots \cup S_n| = |S_1 \cup \cdots \cup S_{n-1}| + |S_n| - |(S_1 \cup \cdots \cup S_{n-1}) \cap S_n|. \]


Since the set operations $\cap$ and $\cup$ obey the distributive law, we can write the subtracted
term as

\[ |(S_1 \cap S_n) \cup (S_2 \cap S_n) \cup \cdots \cup (S_{n-1} \cap S_n)|, \]


which is the union of the $n - 1$ finite sets $S_i \cap S_n$ for $i$ in the range $1 \le i \le n - 1$. So we
can apply the induction hypothesis to this term, and to the first term $|S_1 \cup \cdots \cup S_{n-1}|$. We
obtain


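\begin{align*}
|S_1 \cup \cdots \cup S_n| ={}& \sum_{k=1}^{n-1} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le n-1} |S_{i_1} \cap \cdots \cap S_{i_k}| \\
& + |S_n| - \sum_{j=1}^{n-1} (-1)^{j-1} \sum_{1 \le i_1 < \cdots < i_j \le n-1} |(S_{i_1} \cap S_n) \cap \cdots \cap (S_{i_j} \cap S_n)|.
\end{align*}


We modify the second line of this formula as follows. First, observe that


\[ \bigcap_{r=1}^{j} (S_{i_r} \cap S_n) = S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_j} \cap S_n. \]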
Next, change the summation index by setting $k = j + 1$ and defining $i_k = n$. The full 
formula now reads 

$$\begin{aligned}
|S_1 \cup \cdots \cup S_n| = {} & \sum_{k=1}^{n-1} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k < n} |S_{i_1} \cap \cdots \cap S_{i_k}| \\
& + |S_n| + \sum_{k=2}^{n} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_{k-1} < i_k = n} |S_{i_1} \cap \cdots \cap S_{i_k}|.
\end{aligned}$$

We can absorb $|S_n|$ into the sum on the second line by allowing $k$ to range from 1 to $n$ 
there. Also, letting $k$ range from 1 to $n$ in the first summation does not introduce any new 
terms. After making these adjustments, the only difference between the formulas on the 
first and second lines is that $i_k < n$ in the first line while $i_k = n$ in the second line. We can 
now combine the two summations to obtain 

$$|S_1 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le n} |S_{i_1} \cap \cdots \cap S_{i_k}|, \tag{4.2}$$

which is the required formula (4.1). This completes the induction proof. 


4.5. The Union-Avoiding Rule. Suppose $S_1, \ldots, S_n$ are subsets of a finite set $X$. The 
number of elements $x \in X$ that belong to none of the $S_i$ is 

$$|X - (S_1 \cup \cdots \cup S_n)| = |X| + \sum_{k=1}^{n} (-1)^{k} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|.$$

This equation follows by applying the original Inclusion-Exclusion Formula and the 
Difference Rule 1.40. 


Intuitively, the preceding formula is applicable when we are trying to count objects in $X$ 
that must simultaneously avoid a number of specified bad properties. Each set $S_i$ consists 
of those objects in $X$ that have the $i$th bad property (and possibly other bad properties 
too). We give examples of such counting problems in the next section. 

In many applications of inclusion-exclusion, the sizes of the various intersections 
$|S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|$ (for fixed $k$) are all the same. When this happens, we have the following 
simplified version of the Inclusion-Exclusion Formula. 


4.6. Simplified Version of the Inclusion-Exclusion Formula. Let $S_1, \ldots, S_n$ be finite 
sets. Suppose that for all $k \ge 1$, the intersection of any $k$ distinct sets among the $S_i$'s always 
has cardinality $N(k)$. In other words, $|S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}| = N(k)$ for all choices of indices 
$i_1 < i_2 < \cdots < i_k$. Then 

$$|S_1 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} N(k).$$

If all $S_i$ are subsets of a given finite set $X$, we also have 

$$|X - (S_1 \cup \cdots \cup S_n)| = |X| + \sum_{k=1}^{n} (-1)^{k} \binom{n}{k} N(k).$$

These equations follow by substituting $N(k)$ for each summand $|S_{i_1} \cap \cdots \cap S_{i_k}|$ in the 
previous inclusion-exclusion formulas and noting (by the Subset Rule) that there are $\binom{n}{k}$ 
such summands. 
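
As a quick sanity check of formula (4.1), the following Python sketch evaluates the alternating sum over all intersections and compares it with the size of the union computed directly. The function name and the random test data are illustrative choices, not part of the text.

```python
from itertools import combinations
import random

def union_size_by_inclusion_exclusion(sets):
    """Evaluate the right side of formula (4.1) for a list of finite sets."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        # Sum |S_{i_1} ∩ ... ∩ S_{i_k}| over all k-element choices of sets.
        for chosen in combinations(sets, k):
            total += sign * len(set.intersection(*chosen))
    return total

random.seed(0)  # fixed seed so the check is reproducible
sets = [set(random.sample(range(20), random.randint(0, 12))) for _ in range(4)]
assert union_size_by_inclusion_exclusion(sets) == len(set().union(*sets))
```

The loop visits all $2^n - 1$ nonempty index sets, so this is only practical for small $n$; the simplified version 4.6 avoids that cost when the intersection sizes depend only on $k$.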
4.2 Examples of the Inclusion-Exclusion Formula 


We can use inclusion-exclusion formulas to enumerate complicated collections of objects 
that would be very difficult to count using only the rules in Chapter 1. We begin with some 
problems illustrating the Union-Avoiding Rule. The key to using this rule is to choose the 
sets $S_1, \ldots, S_n$ so that each set $S_i$ consists of those objects in some big set $X$ that have a 
certain bad property. We want to count the objects in $X$ that avoid all of the bad properties. 
The set of all these good objects is $X - (S_1 \cup \cdots \cup S_n)$, and the cardinality of this set is 
given by the Union-Avoiding Rule. For this approach to succeed, we must be able to count 
the sets $S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}$ using other counting rules. 


4.7. Example: Bridge Hands. A bridge hand is a 13-element subset of a 52-card deck. 
A face card is a jack, queen, king, or ace. How many bridge hands have at least one of each 
kind of face card? To answer this question, let $X$ be the set of all bridge hands; we know 
$|X| = \binom{52}{13}$ by the Subset Rule. In this problem, a hand is bad iff it lacks a particular kind 
of face card. So we introduce the set $S_1$ of all 13-card hands that do not contain a jack. 
Similarly, let $S_2$ be the set of hands in $X$ containing no queen; let $S_3$ be the set of hands in 
$X$ containing no king; and let $S_4$ be the set of hands in $X$ containing no ace. Then the good 
hands we are trying to count are precisely the members of the set $X - (S_1 \cup S_2 \cup S_3 \cup S_4)$. 

To use the Union-Avoiding Rule, we must now compute the sizes of the various 
intersections $S_{i_1} \cap \cdots \cap S_{i_k}$. Note that $|S_1| = \binom{48}{13}$ since we can build a hand in $S_1$ by choosing 
13 cards out of the 48 non-jacks in the deck. Similarly, $|S_2| = |S_3| = |S_4| = \binom{48}{13}$. Next, 
$|S_1 \cap S_3| = \binom{44}{13}$ since we can build hands in $S_1 \cap S_3$ by choosing 13 cards out of the 44 
cards in the deck that are neither jacks nor kings. The same formula holds for all other 
intersections of the form $S_{i_1} \cap S_{i_2}$. Similarly, each intersection $S_{i_1} \cap S_{i_2} \cap S_{i_3}$ has size $\binom{40}{13}$, 
while $|S_1 \cap S_2 \cap S_3 \cap S_4| = \binom{36}{13}$. Observe that the simplified version of the Union-Avoiding 
Rule can be used here, with $N(k) = \binom{52-4k}{13}$ for $k = 1, 2, 3, 4$. Thus, the answer to the 
original question is 

$$\binom{52}{13} - 4\binom{48}{13} + 6\binom{44}{13} - 4\binom{40}{13} + \binom{36}{13} = 128{,}971{,}619{,}088.$$

Next, how many 13-card bridge hands have at least one jack, at least one queen, and at 
least one king, but do not contain any ace cards or spade cards? The last condition can be 
dealt with as follows: throw out the $13 + 4 - 1 = 16$ aces and spades at the outset, leaving 
$52 - 16 = 36$ cards, which include three jacks, three queens, and three kings. An 
inclusion-exclusion argument like the one in the last paragraph now leads to the answer 

$$\binom{36}{13} - 3\binom{33}{13} + 3\binom{30}{13} - \binom{27}{13} = 930{,}511{,}530.$$
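
Both bridge-hand counts are instances of the simplified Union-Avoiding Rule, so they can be reproduced with a few lines of Python. The helper below is a sketch with a hypothetical name and parameters; it assumes every "kind" has the same number of cards in the deck.

```python
from math import comb

def hands_with_all_kinds(deck_size, kinds, per_kind, hand_size=13):
    """Count hands containing at least one card of each of `kinds` kinds,
    where each kind has `per_kind` cards in a deck of `deck_size` cards.
    The k = 0 term is |X|; the k-th term removes k whole kinds from the deck."""
    return sum((-1) ** k * comb(kinds, k) * comb(deck_size - per_kind * k, hand_size)
               for k in range(kinds + 1))

print(hands_with_all_kinds(52, 4, 4))  # 128971619088
print(hands_with_all_kinds(36, 3, 3))  # 930511530
```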


4.8. Example: Words. How many words in $X = R(1^2 2^2 3^2 \cdots n^2)$ never have two adjacent 
letters that are equal? Note first that $|X| = \binom{2n}{2,2,\ldots,2} = (2n)!/2^n$ by the Anagram Rule. 

Next we must define sets $S_1, \ldots, S_n$ to model the bad properties that we need to avoid. 
For $1 \le i \le n$, let $S_i$ be the set of words in $X$ in which the two copies of letter $i$ are 
adjacent to each other. Our goal is to count the words in $X - (S_1 \cup \cdots \cup S_n)$. To do so, fix 
$i_1 < i_2 < \cdots < i_k$ and consider the intersection $S_{i_1} \cap \cdots \cap S_{i_k}$. Given a word $w$ in this 
intersection, form a new word by replacing the two consecutive copies of $i_j$ by a single copy 
of $i_j$, for $1 \le j \le k$. This operation defines a bijection from $S_{i_1} \cap \cdots \cap S_{i_k}$ onto the set 
$R(1^{a_1} 2^{a_2} \cdots n^{a_n})$, where $a_i = 1$ if $i = i_j$ for some $j$, and $a_i = 2$ otherwise. (The inverse 
bijection replaces each $i_j$ by two consecutive copies of $i_j$.) By the Bijection Rule and the 
Anagram Rule, we conclude that 

$$|S_{i_1} \cap \cdots \cap S_{i_k}| = \binom{k + 2(n-k)}{\underbrace{1, \ldots, 1}_{k}, \underbrace{2, \ldots, 2}_{n-k}} = (2n-k)!/2^{n-k}.$$

This expression depends only on $k$, not on the indices $i_1, \ldots, i_k$. Also, when $k = 0$, this 
expression reduces to $|X|$. Using the Simplified Inclusion-Exclusion Formula 4.6, we conclude 
that 

$$|X - (S_1 \cup \cdots \cup S_n)| = \sum_{k=0}^{n} (-1)^k \binom{n}{k} \frac{(2n-k)!}{2^{n-k}}.$$
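
The closed formula of Example 4.8 can be compared against a brute-force enumeration for small $n$. The sketch below uses illustrative function names; the brute-force version generates every rearrangement, so it is only meant for very small inputs.

```python
from itertools import permutations
from math import comb, factorial

def no_adjacent_equal_formula(n):
    """Inclusion-exclusion count from Example 4.8."""
    # (2n-k)!/2^(n-k) is a multinomial coefficient, so the integer division is exact.
    return sum((-1) ** k * comb(n, k) * (factorial(2 * n - k) // 2 ** (n - k))
               for k in range(n + 1))

def no_adjacent_equal_brute(n):
    """Enumerate all distinct words with two copies of each letter 1..n."""
    letters = [i for i in range(1, n + 1) for _ in range(2)]
    words = set(permutations(letters))
    return sum(all(a != b for a, b in zip(w, w[1:])) for w in words)

for n in range(1, 5):
    assert no_adjacent_equal_formula(n) == no_adjacent_equal_brute(n)
print([no_adjacent_equal_formula(n) for n in range(1, 5)])  # [0, 2, 30, 864]
```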
4.9. Example: Integer Equations. For given $n$, $m$, and $b$, how many integer sequences 
$(z_1, z_2, \ldots, z_n)$ solve the equation $z_1 + z_2 + \cdots + z_n = m$ and also satisfy $0 \le z_i \le b$ for 
all $i$? To answer this, let $X$ be the set of all solutions to this equation with each $z_i \in \mathbb{Z}_{\ge 0}$. 
For $1 \le i \le n$, let $S_i$ be the set of solutions where $z_i > b$. By the Integer Equation Rule, 
$|X| = \binom{m+n-1}{m,\,n-1}$. Next, fix indices $i_1, \ldots, i_k$ with $1 \le i_1 < \cdots < i_k \le n$. By subtracting $b+1$ 
from each $z_{i_j}$, we obtain a bijection from the set of sequences in $S_{i_1} \cap \cdots \cap S_{i_k}$ onto the 
set of sequences $(y_1, \ldots, y_n)$ satisfying $y_i \in \mathbb{Z}_{\ge 0}$ for all $i$ and $y_1 + \cdots + y_n = m - k(b+1)$. 
By the Bijection Rule and the Integer Equation Rule, we conclude that 
$|S_{i_1} \cap \cdots \cap S_{i_k}| = \binom{m - k(b+1) + n - 1}{m - k(b+1),\,n-1}$. This quantity depends on $k$ but not on $i_1, \ldots, i_k$. Using the Simplified 
Inclusion-Exclusion Formula 4.6, we obtain the answer 

$$|X - (S_1 \cup \cdots \cup S_n)| = \sum_{k=0}^{n} (-1)^k \binom{n}{k} \binom{m - k(b+1) + n - 1}{n-1},$$

where terms with $m - k(b+1) < 0$ are zero. For instance, if $n = 5$, $m = 20$, and $b = 6$, the number of solutions is 
$\binom{24}{4} - 5\binom{17}{4} + 10\binom{10}{4} = 826$. 
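
The bounded-solution count of Example 4.9 translates directly into code. In this sketch (names are illustrative, and $n \ge 1$ is assumed), terms with $m - k(b+1) < 0$ are skipped because no nonnegative solutions exist in that case; a brute-force enumeration confirms the worked instance.

```python
from itertools import product
from math import comb

def bounded_solutions(n, m, b):
    """Number of (z_1..z_n) with sum m and 0 <= z_i <= b, via inclusion-exclusion."""
    total = 0
    for k in range(n + 1):
        remaining = m - k * (b + 1)
        if remaining < 0:
            break  # this and all later terms contribute zero
        total += (-1) ** k * comb(n, k) * comb(remaining + n - 1, n - 1)
    return total

def bounded_solutions_brute(n, m, b):
    """Direct enumeration of all (b+1)^n candidate sequences."""
    return sum(1 for z in product(range(b + 1), repeat=n) if sum(z) == m)

print(bounded_solutions(5, 20, 6))  # 826
assert bounded_solutions(5, 20, 6) == bounded_solutions_brute(5, 20, 6)
```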


4.3 Surjections and Stirling Numbers 


We now use inclusion-exclusion to prove the Surjection Rule, which was stated without 
proof in Chapter 1. 


4.10. The Surjection Rule. For $m \ge n \ge 1$, the number of surjections from an $m$-element 
set onto an $n$-element set is $\sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m$. 


Proof. Fix $m \ge n \ge 1$; we count the surjections with domain $A = \{a_1, a_2, \ldots, a_m\}$ and 
codomain $B = \{b_1, b_2, \ldots, b_n\}$. Let $X$ be the set of all functions $f : A \to B$. By the 
Function Rule, $|X| = n^m$. For $1 \le i \le n$, let $S_i$ consist of all functions $f \in X$ such that $b_i$ 
is not in the image of $f$. A function $f \in X$ is a surjection iff $f$ belongs to none of the $S_i$. 
Thus, we must compute $|X - (S_1 \cup \cdots \cup S_n)|$. Consider one of the intersections $S_{i_1} \cap \cdots \cap S_{i_k}$, 
where $1 \le i_1 < i_2 < \cdots < i_k \le n$. By shrinking the codomain, each function $f$ belonging 
to this intersection corresponds bijectively to an arbitrary function mapping $A$ into the 
$(n-k)$-element codomain $B - \{b_{i_1}, b_{i_2}, \ldots, b_{i_k}\}$. By the Function Rule, the number of such 
functions is $(n-k)^m$, a value which depends on $k$ but not on $i_1, \ldots, i_k$. Using the Simplified 
Inclusion-Exclusion Formula 4.6, we find that 

$$|X - (S_1 \cup \cdots \cup S_n)| = n^m + \sum_{k=1}^{n} (-1)^k \binom{n}{k} (n-k)^m.$$

We can absorb $n^m$ into the sum by letting $k$ start at 0, which gives the formula in the 
statement of the Surjection Rule. □ 


In §2.13, we proved the following alternate version of the Surjection Rule: the number 
of surjections from an $m$-element set onto an $n$-element set is $S(m,n)\,n!$, where $S(m,n)$ is a 
Stirling number of the second kind. Comparing this to the rule just proved, we obtain the 
following summation formula for Stirling numbers. 


4.11. Corollary: Summation Formula for Stirling Numbers of the Second Kind. 
For $m \ge n \ge 1$, 

$$S(m,n) = \frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m.$$
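
The Surjection Rule and the resulting Stirling-number formula are easy to check numerically. The following sketch (illustrative names) compares the inclusion-exclusion sum with a direct enumeration of functions, then divides by $n!$ to recover $S(m,n)$.

```python
from itertools import product
from math import comb, factorial

def surjections(m, n):
    """Number of surjections from an m-set onto an n-set, by the Surjection Rule."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))

def surjections_brute(m, n):
    """Enumerate all n^m functions (as tuples of images) and keep the onto ones."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)

def stirling2(m, n):
    """S(m, n) obtained by dividing the surjection count by n!."""
    return surjections(m, n) // factorial(n)

assert surjections(4, 2) == surjections_brute(4, 2) == 14
print(stirling2(4, 2), stirling2(5, 3))  # 7 25
```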


4.4 Euler's $\phi$ Function 


Our next illustration of inclusion-exclusion comes from number theory. Recall that for any 
positive integers $a$ and $b$, $\gcd(a,b)$ denotes the greatest common divisor of $a$ and $b$, which 
is the largest integer dividing both $a$ and $b$. 


4.12. Definition: Euler's $\phi$ Function. For each integer $m \ge 1$, let $\phi(m)$ be the number 
of integers $x \in \{1, 2, \ldots, m\}$ such that $\gcd(x, m) = 1$. 


For example, if $m = 12$, then the relevant integers $x$ are 1, 5, 7, and 11, so $\phi(12) = 4$. 
The function $\phi$ is prominent in algebra and number theory and has applications to modern 
cryptography. 


4.13. Theorem: Product Formula for $\phi(m)$. Suppose an integer $m > 1$ has prime 
factorization $m = p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}$, where the $p_i$ are distinct primes and each $e_i \ge 1$. Then 

$$\phi(m) = \prod_{i=1}^{n} p_i^{e_i - 1}(p_i - 1) = m \prod_{i=1}^{n} (1 - 1/p_i).$$


Proof. Let $X = \{1, 2, \ldots, m\}$; for $1 \le i \le n$, let $S_i$ be the set of $x \in X$ such that $p_i$ is a 
factor of $x$. By the Fundamental Theorem of Arithmetic, $x \in X$ is not relatively prime to $m$ 
iff $x$ and $m$ have a common factor greater than 1 iff $x$ and $m$ have a common prime factor. 
It follows that 

$$\phi(m) = |X - (S_1 \cup S_2 \cup \cdots \cup S_n)|,$$

so we can compute $\phi(m)$ using the Union-Avoiding Rule. The relevant inclusion-exclusion 
formula can be written 

$$|X - (S_1 \cup \cdots \cup S_n)| = \sum_{I \subseteq \{1,2,\ldots,n\}} (-1)^{|I|} \left| \bigcap_{i \in I} S_i \right|,$$

using the convention that $\bigcap_{i \in \emptyset} S_i$ is the set $X$. Fix a subset $I = \{i_1 < \cdots < i_k\} \subseteq 
\{1, 2, \ldots, n\}$, and consider the intersection $S_{i_1} \cap \cdots \cap S_{i_k}$. An integer $x \in X$ lies in this 
intersection iff $p_{i_j}$ divides $x$ for $1 \le j \le k$ iff the product $q = p_{i_1} p_{i_2} \cdots p_{i_k}$ divides $x$ iff $x$ is a 
multiple of $q$. Now, the number of multiples of $q$ between 1 and $m$ is $m/q = m / \prod_{i \in I} p_i$. If 
$I = \emptyset$ and the empty product is interpreted as 1, this expression becomes $m = |X|$. Inserting 
these expressions into the formula above, we find that 

$$\phi(m) = |X - (S_1 \cup \cdots \cup S_n)| = m \sum_{I \subseteq \{1,2,\ldots,n\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i}.$$

Comparing to the theorem statement, we see that it suffices to prove that 

$$\prod_{i=1}^{n} \left(1 - \frac{1}{p_i}\right) = \sum_{I \subseteq \{1,2,\ldots,n\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i}.$$


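This identity can be deduced from the Generalized Distributive Law (Exercise 2-16), or as 
a special case of Exercise 4-68. Here we give a proof by induction on $n$. When $n = 1$, 
both sides of the formula are $1 - 1/p_1$. Fix $n > 1$, and assume 
$\prod_{i=1}^{n-1} (1 - p_i^{-1}) = \sum_{I \subseteq \{1,2,\ldots,n-1\}} (-1)^{|I|} / \prod_{i \in I} p_i$ is already known. Multiplying both sides by $1 - 1/p_n$, we 
obtain 

$$\prod_{i=1}^{n} \left(1 - \frac{1}{p_i}\right) = \left( \sum_{I \subseteq \{1,2,\ldots,n-1\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i} \right) \cdot \left(1 - \frac{1}{p_n}\right).$$

Using the distributive law, the right side becomes 

$$\sum_{I \subseteq \{1,\ldots,n-1\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i} + \sum_{I \subseteq \{1,\ldots,n-1\}} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i} \cdot \frac{(-1)}{p_n}.$$

By moving $-1/p_n$ inside the second sum and replacing the summation index $I$ by $I \cup \{n\}$, 
the second sum becomes 

$$\sum_{I \subseteq \{1,\ldots,n\}:\, n \in I} \frac{(-1)^{|I|}}{\prod_{i \in I} p_i}.$$

On the other hand, we can think of the first sum as ranging over all $I \subseteq \{1, \ldots, n\}$ such 
that $n \notin I$. The two sums can now be combined into a single sum over all subsets $I$ of 
$\{1, \ldots, n\}$, which completes the induction step. □ 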
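
The three descriptions of $\phi(m)$ appearing in this proof (the definition, the inclusion-exclusion sum over subsets of prime divisors, and the product formula) can be compared in a short Python sketch. The function names and the trial-division factorization are illustrative choices, not taken from the text.

```python
from itertools import combinations
from math import gcd, prod

def prime_factors(m):
    """Distinct prime divisors of m, found by trial division."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes

def phi_by_definition(m):
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def phi_by_inclusion_exclusion(m):
    """Sum of (-1)^|I| * m / prod(p_i for i in I) over all subsets I of the primes."""
    primes = prime_factors(m)
    return sum((-1) ** k * (m // prod(I))
               for k in range(len(primes) + 1)
               for I in combinations(primes, k))

def phi_by_product(m):
    result = m
    for p in prime_factors(m):
        result = result // p * (p - 1)  # multiply by (1 - 1/p) in exact arithmetic
    return result

assert all(phi_by_definition(m) == phi_by_inclusion_exclusion(m) == phi_by_product(m)
           for m in range(1, 300))
print(phi_by_product(12))  # 4
```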


4.14. Remark. Here is a sketch of an alternative derivation of the formula for $\phi(m)$ that 
avoids inclusion-exclusion, but assumes some results from algebra and number theory. For 
any commutative ring $R$, we let $R^*$ be the set of units in $R$, i.e., the set of $x \in R$ such that 
there exists $y \in R$ with $xy = yx = 1_R$. The following facts must now be verified. First, if $R$ 
and $T$ are isomorphic rings, then $|R^*| = |T^*|$. Second, given a product ring $R \times S$, we have 
$(R \times S)^* = R^* \times S^*$ and hence (by the Product Rule) $|(R \times S)^*| = |R^*| \cdot |S^*|$. Third, 
$\gcd(x, n) = 1$ iff there exist integers $y, z$ with $xy + nz = 1$ iff $x$ has a multiplicative inverse 
in the ring of integers modulo $n$. So $\phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*|$. Fourth, by the Chinese Remainder 
Theorem, the rings $\mathbb{Z}/mn\mathbb{Z}$ and $\mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}$ are isomorphic whenever $\gcd(m, n) = 1$. 
Combining these four facts, we see that $\gcd(m, n) = 1$ implies 

$$\phi(mn) = |(\mathbb{Z}/mn\mathbb{Z})^*| = |(\mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z})^*| = |(\mathbb{Z}/m\mathbb{Z})^*| \cdot |(\mathbb{Z}/n\mathbb{Z})^*| = \phi(m)\phi(n).$$

Iteration of this result gives 

$$\phi(p_1^{e_1} \cdots p_n^{e_n}) = \prod_{i=1}^{n} \phi(p_i^{e_i})$$

whenever $p_1, \ldots, p_n$ are distinct primes. Thus, it suffices to evaluate $\phi$ at prime powers. 
A direct counting argument using the Difference Rule and the definition of $\phi$ shows that 
$\phi(p^e) = p^e - p^{e-1} = p^{e-1}(p - 1)$ when $p$ is prime and $e \ge 1$. So we obtain the first formula 
for $\phi$ given in Theorem 4.13. 
4.5 Derangements 


The Inclusion-Exclusion Formula allows us to enumerate a special class of permutations 
called derangements. Intuitively, a derangement of $1, 2, \ldots, n$ is a rearrangement of these 
$n$ symbols such that no symbol remains in its original position. The formal definition is as 
follows. 


4.15. Definition: Derangements. A derangement of a set $S$ is a bijection $f : S \to S$ such 
that $f(x) \neq x$ for all $x \in S$. For $n \ge 0$, let $D_n$ be the set of derangements of $\{1, 2, \ldots, n\}$, 
and let $d_n = |D_n|$. 


We have $d_0 = 1$ (since the empty function with domain and codomain $\emptyset$ satisfies the 
definition of derangement), while $d_1 = 0$. To give more examples of derangements, let us 
identify an element $f \in D_n$ with the word $f(1)f(2) \cdots f(n)$. Then $d_2 = 1$ since 21 is the 
unique derangement of $\{1, 2\}$. The derangements of $\{1, 2, 3\}$ are 312 and 231, so that $d_3 = 2$. 
The permutation 5317426 is a derangement of $\{1, 2, 3, 4, 5, 6, 7\}$. 


4.16. Summation Formula for Derangements. For $n \ge 1$, the number of derangements 
of an $n$-element set is 

$$d_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}.$$

Consequently, for all $n \ge 1$, $d_n$ is the closest integer to $n!/e$. 


Proof. Let $X$ be the set of all permutations of $\{1, 2, \ldots, n\}$; by the Permutation Rule, 
$|X| = n!$. For $1 \le i \le n$, let $S_i = \{f \in X : f(i) = i\}$. The set $D_n$ consists of precisely those 
elements in $X$ that belong to none of the $S_i$, so $D_n = X - (S_1 \cup \cdots \cup S_n)$. To apply the 
Union-Avoiding Rule, we must determine the size of each intersection $S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}$, 
where $1 \le i_1 < i_2 < \cdots < i_k \le n$. A permutation $f \in X$ belongs to this intersection iff $f$ 
fixes $i_1, \ldots, i_k$ and permutes the remaining $n - k$ symbols among themselves. The number 
of such permutations is $(n-k)!$. This number depends only on $k$ and not on the indices 
$i_1, \ldots, i_k$. Applying the Simplified Union-Avoiding Rule 4.6, we obtain 

$$d_n = n! + \sum_{k=1}^{n} (-1)^k \binom{n}{k} (n-k)! = n! + \sum_{k=1}^{n} (-1)^k \frac{n!}{k!} = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}.$$

To relate this formula to the expression $n!/e$, recall from calculus that for all real $x$, 

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

Taking $x = -1$, we see that 

$$\frac{1}{e} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}.$$

Multiplying by $n!$ and comparing to our formula for $d_n$, we see that 

$$n!/e - d_n = n! \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k!}.$$

It now suffices to show that the right side of this formula is less than 1/2 in absolute value. 
Factoring out $\frac{(-1)^{n+1}}{n+1}$ from each term in the series, we obtain 

$$|n!/e - d_n| = \frac{1}{n+1} \left| 1 - \frac{1}{n+2} + \frac{1}{(n+2)(n+3)} - \frac{1}{(n+2)(n+3)(n+4)} + \cdots \right|.$$

The series within the absolute values on the right side is an alternating series that converges 
to a sum strictly less than 1. Since $n \ge 1$, it follows that 

$$|n!/e - d_n| < \frac{1}{n+1} \cdot 1 \le 1/2. \quad \square$$

The following table lists the first few values of $d_n$. 


n   | 0 | 1 | 2 | 3 | 4 | 5  | 6   | 7    | 8      | 9
d_n | 1 | 0 | 1 | 2 | 9 | 44 | 265 | 1854 | 14,833 | 133,496
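
The table can be regenerated from the summation formula 4.16. The sketch below (illustrative names) also checks the formula against a brute-force enumeration of permutations for small $n$, and against the closest-integer description; the last check uses floating-point arithmetic, which is accurate enough for the small values of $n$ tested here.

```python
from itertools import permutations
from math import e, factorial

def derangements_formula(n):
    """d_n = n! * sum_{k=0}^n (-1)^k / k!, computed in exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

def derangements_brute(n):
    """Count permutations of {0, ..., n-1} with no fixed point."""
    return sum(1 for f in permutations(range(n)) if all(f[i] != i for i in range(n)))

assert all(derangements_formula(n) == derangements_brute(n) for n in range(8))
assert all(derangements_formula(n) == round(factorial(n) / e) for n in range(1, 10))
print([derangements_formula(n) for n in range(10)])
# [1, 0, 1, 2, 9, 44, 265, 1854, 14833, 133496]
```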


Like any permutation, a derangement has a functional digraph consisting of a disjoint 
union of one or more cycles. A permutation is a derangement iff there are no 1-cycles in its 
functional digraph. This observation leads to the following recursion for derangements. 


4.17. First Recursion for Derangements. We have $d_0 = 1$, $d_1 = 0$, and 
$d_n = (n-1)d_{n-1} + (n-1)d_{n-2}$ for all $n \ge 2$. 


Proof. Fix $n \ge 2$. Write the set of derangements $D_n$ as the disjoint union of sets $A$ and 
$B$, where $A$ consists of those derangements in which $n$ is involved in a cycle of length 2, 
and $B$ consists of the derangements where $n$ is in a cycle of length greater than 2. To 
build an object in $A$, choose the partner of $n$ in its 2-cycle ($n - 1$ ways), and then choose 
a derangement of the remaining objects ($d_{n-2}$ ways). To build an object in $B$, choose a 
derangement of the first $n - 1$ objects ($d_{n-1}$ ways), consider the functional digraph of this 
derangement, and splice $n$ into a cycle just before any of the $n - 1$ available elements. Since 
all original cycles have length at least 2, this construction will ensure that $n$ appears in a 
cycle of length at least 3. The recursion now follows from the Product Rule and the Sum 
Rule. □ 


This recursion expresses $d_n$ in terms of the previous two derangement numbers. Our next 
result shows how to transform this formula into another recursion giving $d_n$ as a function 
of $d_{n-1}$. 


4.18. Second Recursion for Derangements. We have $d_0 = 1$ and 
$d_n = n d_{n-1} + (-1)^n$ for all $n \ge 1$. 

Proof. We use induction on $n$. If $n = 1$, then 

$$d_n = d_1 = 0 = 1 \cdot 1 + (-1)^1 = n d_{n-1} + (-1)^n.$$

Now assume $n > 1$ and that $d_{n-1} = (n-1)d_{n-2} + (-1)^{n-1}$. We can use this assumption to 
eliminate $(n-1)d_{n-2}$ in the derangement recursion proved earlier. We thereby obtain 

$$d_n = (n-1)d_{n-1} + (n-1)d_{n-2} = (n-1)d_{n-1} + (d_{n-1} - (-1)^{n-1}) = n d_{n-1} + (-1)^n.$$

This completes the induction step. □ 
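
Both recursions give a linear-time way to tabulate derangement numbers. The following sketch (illustrative names) builds the list $d_0, \ldots, d_N$ with each recursion and confirms that the two agree.

```python
def derangements_first(n_max):
    """d_0..d_{n_max} via d_n = (n-1) d_{n-1} + (n-1) d_{n-2}."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * d[n - 1] + (n - 1) * d[n - 2])
    return d[: n_max + 1]

def derangements_second(n_max):
    """d_0..d_{n_max} via d_n = n d_{n-1} + (-1)^n."""
    d = [1]
    for n in range(1, n_max + 1):
        d.append(n * d[n - 1] + (-1) ** n)
    return d

assert derangements_first(20) == derangements_second(20)
print(derangements_second(9))
# [1, 0, 1, 2, 9, 44, 265, 1854, 14833, 133496]
```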
4.6 Involutions 


In Chapter 2, we saw how bijections could be used to prove combinatorial identities. Some 
identities involve a mixture of positive and negative terms. One can use special bijections 
called involutions to give combinatorial proofs of such identities. We introduce this idea 
with the following binomial coefficient identity. 


4.19. Theorem. For all $n \ge 1$, $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$. 


Proof. The result can be proved algebraically by using the Binomial Theorem 2.8 to expand 
the left side of $(-1 + 1)^n = 0$. To prove the identity combinatorially, let $X$ be the set of all 
subsets of $\{1, 2, \ldots, n\}$. For each $S \in X$, we define the sign of $S$ to be $\mathrm{sgn}(S) = (-1)^{|S|}$. 
Since there are $\binom{n}{k}$ subsets $S$ of size $k$, and $\mathrm{sgn}(S) = (-1)^k$ for all such subsets, we see that 

$$\sum_{S \in X} \mathrm{sgn}(S) = \sum_{k=0}^{n} (-1)^k \binom{n}{k}.$$

Thus we have found a combinatorial model for the left side of the identity to be proved, 
where the model involves signed objects. 

To continue, we define a function $I : X \to X$ as follows. Given $S \in X$, let $I(S) = S \cup \{1\}$ if 
$1 \notin S$, and let $I(S) = S - \{1\}$ if $1 \in S$. For example, $I(\{2,4\}) = \{1,2,4\}$ and $I(\{1,3\}) = \{3\}$. 
Observe that $I(I(S)) = S$ for all $S \in X$; in other words, $I \circ I = \mathrm{id}_X$. Thus, $I$ is a bijection 
that is equal to its own inverse. Furthermore, since $|I(S)| = |S| \pm 1$, $\mathrm{sgn}(I(S)) = -\mathrm{sgn}(S)$ 
for all $S \in X$. It follows that $I$ pairs each positive object in $X$ with a negative object in $X$. 
Consequently, the number of positive objects in $X$ equals the number of negative objects 
in $X$, and so $\sum_{S \in X} \mathrm{sgn}(S) = 0$. □ 
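
The involution in this proof is simply "toggle membership of 1", which is a symmetric difference with $\{1\}$. The sketch below (illustrative names, with $n = 5$ chosen arbitrarily) checks the three properties the proof relies on: $I \circ I = \mathrm{id}_X$, sign reversal, and the vanishing signed sum.

```python
from itertools import combinations

def toggle_one(S):
    """The involution I from the proof: add 1 if absent, remove it if present."""
    return S ^ frozenset({1})

def sign(S):
    return (-1) ** len(S)

n = 5
X = [frozenset(c) for k in range(n + 1) for c in combinations(range(1, n + 1), k)]

assert all(toggle_one(toggle_one(S)) == S for S in X)      # I(I(S)) = S
assert all(sign(toggle_one(S)) == -sign(S) for S in X)     # sign-reversing
assert sum(sign(S) for S in X) == 0                        # signed count vanishes
```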


The general setup for involution proofs is described as follows. 


4.20. Definition: Involutions. An involution on a set $X$ is a function $I : X \to X$ such 
that $I \circ I = \mathrm{id}_X$. Equivalently, $I$ is a bijection on $X$ and $I = I^{-1}$. Given an involution $I$, 
the fixed point set of $I$ is the set $\mathrm{Fix}(I) = \{x \in X : I(x) = x\}$, which may be empty. If 
$\mathrm{sgn} : X \to \{+1, -1\}$ is a function that attaches a sign to every object in $X$, we say that $I$ 
is a sign-reversing involution iff for all $x \in X - \mathrm{Fix}(I)$, $\mathrm{sgn}(I(x)) = -\mathrm{sgn}(x)$. 


4.21. Involution Theorem. Given a finite set $X$ of signed objects and a sign-reversing 
involution $I$ on $X$, 

$$\sum_{x \in X} \mathrm{sgn}(x) = \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x).$$

Proof. Define 
$X^+ = \{x \in X - \mathrm{Fix}(I) : \mathrm{sgn}(x) = +1\}$ and $X^- = \{x \in X - \mathrm{Fix}(I) : \mathrm{sgn}(x) = -1\}$. 


By definition, $I$ restricts to $X^+$ and $X^-$ to give functions $I^+ : X^+ \to X^-$ and $I^- : X^- \to 
X^+$ that are mutually inverse bijections. Therefore, $|X^+| = |X^-|$ and 

$$\begin{aligned}
\sum_{x \in X} \mathrm{sgn}(x) &= \sum_{x \in X^+} \mathrm{sgn}(x) + \sum_{x \in X^-} \mathrm{sgn}(x) + \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x) \\
&= |X^+| - |X^-| + \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x) = \sum_{x \in \mathrm{Fix}(I)} \mathrm{sgn}(x). \quad \square
\end{aligned}$$
As a first illustration of the Involution Theorem, we prove a variation of Theorem 4.19. 


4.22. Theorem. For all $n \ge 1$, 

$$\sum_{k=0}^{n} (-1)^k \binom{2n}{k} = (-1)^n \binom{2n-1}{n}.$$

Proof. Let $X$ be the set of all subsets of $\{1, 2, \ldots, 2n\}$ of size at most $n$, and let the sign of a 
subset $T$ be $(-1)^{|T|}$. The left side of the identity to be proved is $\sum_{T \in X} \mathrm{sgn}(T)$. Next, define 
an involution $I$ on $X$ as follows. If $T \in X$ and $1 \in T$, let $I(T) = T - \{1\}$. If $T \in X$ and $1 \notin T$ 
and $|T| < n$, let $I(T) = T \cup \{1\}$. Finally, if $T \in X$ and $1 \notin T$ and $|T| = n$, let $I(T) = T$. 
Note that $I$ is a sign-reversing involution. The fixed points of $I$ are the $n$-element subsets 
of $\{1, 2, \ldots, 2n\}$ not containing 1. There are $\binom{2n-1}{n}$ such subsets, and each of them has sign 
$(-1)^n$. So $\sum_{T \in \mathrm{Fix}(I)} \mathrm{sgn}(T)$ is the right side of the identity to be proved. We complete the 
proof by invoking the Involution Theorem. □ 


Before giving more examples of involutions, we introduce a convenient notational device. 


4.23. Definition: The Truth Function $\chi$. For any logical statement $P$, define $\chi(P) = 1$ 
if $P$ is true, and $\chi(P) = 0$ if $P$ is false. 


4.24. Theorem. For all integers $n \ge 0$, 

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k}^2 = (-1)^{n/2} \binom{n}{n/2} \chi(n \text{ is even}).$$

(The right side is zero when $n$ is odd, and the right side is $(-1)^k \binom{2k}{k}$ when $n = 2k$ is even.) 


Proof. Let $X$ be the set of all pairs $(S, T)$, where $S$ and $T$ are subsets of $\{1, 2, \ldots, n\}$ of 
the same size. Define $\mathrm{sgn}(S, T) = (-1)^{|S|}$. Then the left side of the identity to be proved 
is $\sum_{(S,T) \in X} \mathrm{sgn}(S, T)$. We define an involution $I$ on $X$ as follows. Given $(S, T) \in X$, let $i$ 
be the least integer in $\{1, 2, \ldots, n\}$ (if there is one) such that either $i \notin S$ and $i \notin T$, or 
$i \in S$ and $i \in T$. In the former case, let $I(S, T) = (S \cup \{i\}, T \cup \{i\})$; in the latter case, 
let $I(S, T) = (S - \{i\}, T - \{i\})$; if no such $i$ exists, let $I(S, T) = (S, T)$. For example, taking 
$n = 6$, we find that $I(\{1,3,5\}, \{3,5,6\}) = (\{1,2,3,5\}, \{2,3,5,6\})$, $I(\{1,2,4\}, \{2,3,4\}) = 
(\{1,4\}, \{3,4\})$, and $(\{3,5,6\}, \{1,2,4\})$ is in $\mathrm{Fix}(I)$. 

It is routine to check that $I$ is a sign-reversing involution; in particular, the designated 
integer $i$ in the definition of $I(S, T)$ is the same as the $i$ used to calculate $I(I(S, T))$, so 
$I(I(S, T)) = (S, T)$. By the Involution Theorem, 

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k}^2 = \sum_{(S,T) \in \mathrm{Fix}(I)} \mathrm{sgn}(S, T).$$

Note that $(S, T) \in \mathrm{Fix}(I)$ iff for every $i \in \{1, 2, \ldots, n\}$, $i$ lies in exactly one of the two sets 
$S$ or $T$. Since $S$ and $T$ must have the same size, $(S, T)$ is a fixed point of $I$ iff $n$ is even 
and $|S| = |T| = n/2$ and $S = \{1, 2, \ldots, n\} - T$. If $n$ is odd, the fixed point set is empty, so 
the given signed sum of squared binomial coefficients is zero. If $n$ is even, we can construct 
an arbitrary element of $\mathrm{Fix}(I)$ by choosing any subset $S$ of size $n/2$ and letting $T$ be the 
complementary subset $\{1, 2, \ldots, n\} - S$. Since there are $\binom{n}{n/2}$ choices for $S$, each with sign 
$(-1)^{n/2}$, the formula in the theorem is proved. □ 
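
The involution in the proof of Theorem 4.24 is concrete enough to implement directly. The following sketch (illustrative names) reproduces the three worked examples for $n = 6$ and checks, for small $n$, that the signed sum of squared binomial coefficients equals the signed count of fixed points, as the Involution Theorem predicts.

```python
from itertools import combinations
from math import comb

def involution(S, T, n):
    """Involution from the proof: act on the least i lying in both or neither of S, T."""
    for i in range(1, n + 1):
        if i not in S and i not in T:
            return S | {i}, T | {i}
        if i in S and i in T:
            return S - {i}, T - {i}
    return S, T  # every i lies in exactly one of S, T: a fixed point

def signed_sum(n):
    return sum((-1) ** k * comb(n, k) ** 2 for k in range(n + 1))

def fixed_point_sum(n):
    """Sum of sgn(S, T) = (-1)^|S| over the fixed points of the involution."""
    total = 0
    for k in range(n + 1):
        subsets = [frozenset(c) for c in combinations(range(1, n + 1), k)]
        for S in subsets:
            for T in subsets:
                if involution(S, T, n) == (S, T):
                    total += (-1) ** k
    return total

assert involution({1, 3, 5}, {3, 5, 6}, 6) == ({1, 2, 3, 5}, {2, 3, 5, 6})
assert involution({1, 2, 4}, {2, 3, 4}, 6) == ({1, 4}, {3, 4})
assert involution({3, 5, 6}, {1, 2, 4}, 6) == ({3, 5, 6}, {1, 2, 4})
assert all(signed_sum(n) == fixed_point_sum(n) for n in range(7))
```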
4.25. Example: Stirling Numbers. Recall from §3.6 that $s(n, k) = (-1)^{n-k} c(n, k)$, 
where $c(n, k)$ is the number of permutations of an $n$-element set whose functional digraph 
consists of $k$ cycles. We will show that for all $n \ge 1$, 

$$\sum_{k=1}^{n} s(n, k) = \chi(n = 1).$$

Both sides are 1 when $n = 1$, so assume $n > 1$. Let $X$ be the set of all permutations 
of $\{1, 2, \ldots, n\}$. If $w \in X$ is a permutation with $k$ cycles, define $\mathrm{sgn}(w) = (-1)^k$. Now 
$\sum_{w \in X} \mathrm{sgn}(w) = (-1)^n \sum_{k=1}^{n} s(n, k)$, so it suffices to define a sign-reversing involution $I$ on 
$X$ with no fixed points. Given $w \in X$, the numbers 1 and 2 either appear in the same cycle 
of $w$ or in different cycles. If 1 and 2 are in the same cycle, let the elements on this cycle 
(starting at 1) be 

$$(1, x_1, x_2, \ldots, x_i, 2, y_1, y_2, \ldots, y_j)$$

where $i, j \ge 0$. Define $I(w)$ by replacing this cycle by the two cycles 

$$(1, x_1, x_2, \ldots, x_i)(2, y_1, y_2, \ldots, y_j)$$

and leaving all other cycles the same. Similarly, if 1 and 2 are in different cycles of $w$, write 
these cycles as 

$$(1, x_1, x_2, \ldots, x_i)(2, y_1, y_2, \ldots, y_j)$$

and define $I(w)$ by replacing these two cycles by the single cycle 

$$(1, x_1, x_2, \ldots, x_i, 2, y_1, y_2, \ldots, y_j).$$

It is immediate that $I \circ I = \mathrm{id}_X$, $I$ is sign-reversing, and $I$ has no fixed points. 
We can modify the preceding involution to obtain a combinatorial proof of the identity 

$$\sum_{k \ge 0} s(i, k) S(k, j) = \chi(i = j),$$

which we proved algebraically in Theorem 2.64(d). If $i < j$, then for every $k$, either $s(i, k) = 0$ 
or $S(k, j) = 0$. So both sides of the identity are zero in this case. If $i = j$, the left side reduces 
to $s(i, i) S(i, i) = 1 = \chi(i = j)$. If $j = 0$, the identity is true. So we may assume $i$ and $j$ 
are fixed numbers such that $i > j > 0$. Let $X$ be the set of pairs $(w, U)$, where $w$ is a 
permutation of $\{1, 2, \ldots, i\}$ (viewed as a functional digraph) and $U$ is a set partition of the 
set of cycles in $w$ into $j$ blocks. If $w$ has $k$ cycles, let $\mathrm{sgn}(w, U) = (-1)^k$. Then 

$$\sum_{(w,U) \in X} \mathrm{sgn}(w, U) = (-1)^i \sum_{k=j}^{i} s(i, k) S(k, j),$$

and $\chi(i = j) = 0$. So it suffices to define a sign-reversing involution $I$ on $X$ with no fixed 
points. Given $(w, U) \in X$, there must exist a block of $U$ such that the cycles in this block 
collectively involve more than one point in $\{1, 2, \ldots, i\}$. This follows from the fact that $i$ (the 
number of points) exceeds $j$ (the number of blocks). Among all such blocks in $U$, choose the 
block that contains the smallest possible element in $\{1, 2, \ldots, i\}$. Let this smallest element 
be $a$, and let the second-smallest element in this block be $b$. To calculate $I(w, U)$, modify 
the cycles in this block of $U$ as we did above, with $a$ and $b$ playing the roles of 1 and 2. 
More specifically, a cycle of the form 

$$(a, x_1, \ldots, x_r, b, y_1, \ldots, y_s)$$

gets replaced (within its block) by 

$$(a, x_1, \ldots, x_r)(b, y_1, \ldots, y_s)$$

and vice versa. It is routine to check that $I$ is a sign-reversing involution on $X$ with no fixed 
points. For example, suppose $i = 10$, $j = 3$, $w$ has cycles $(1), (3,5), (2,6,9), (4,8), (7), (10)$, 
and 

$$U = \{\{(1)\}, \{(3,5), (10)\}, \{(2,6,9), (4,8), (7)\}\}.$$

Here the block of $U$ modified by the involution is $\{(2,6,9), (4,8), (7)\}$, $a = 2$, and $b = 4$. 
We compute $I(w, U)$ by replacing the cycles $(2,6,9)$ and $(4,8)$ in $w$ by the single cycle 
$(2,6,9,4,8)$ and letting the new set partition be 

$$U' = \{\{(1)\}, \{(3,5), (10)\}, \{(2,6,9,4,8), (7)\}\}.$$

Note that the original object has sign $(-1)^6 = +1$, whereas $I(w, U)$ has sign $(-1)^5 = -1$. 
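
Both identities of Example 4.25 can be checked numerically. The sketch below is not the involution argument itself: it assumes the standard recursions $c(n,k) = c(n-1,k-1) + (n-1)c(n-1,k)$ and $S(n,k) = S(n-1,k-1) + kS(n-1,k)$, which are not restated in this section, and simply evaluates the sums. Function names are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def c(n, k):
    """Number of permutations of an n-element set with exactly k cycles."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0:
        return 0
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k)

def s(n, k):
    """Signed Stirling number of the first kind, s(n,k) = (-1)^(n-k) c(n,k)."""
    return (-1 if (n - k) % 2 else 1) * c(n, k)

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling number of the second kind."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

# First identity: sum_k s(n, k) = chi(n = 1).
assert all(sum(s(n, k) for k in range(1, n + 1)) == int(n == 1) for n in range(1, 12))

# Second identity: sum_k s(i, k) S(k, j) = chi(i = j).
assert all(sum(s(i, k) * S(k, j) for k in range(i + 1)) == int(i == j)
           for i in range(10) for j in range(10))
```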


4.7 Involutions Related to Inclusion-Exclusion 


This section gives more examples of involution-based proofs. We begin by reproving the 
original Inclusion-Exclusion Formula with an involution. 


4.26. Involution Proof of the Inclusion-Exclusion Formula. Our goal is to prove 
formula (4.1). Moving all terms in this formula to the left side, we can rewrite the goal as 

$$\sum_{k=0}^{n} (-1)^k \sum_{1 \le i_1 < \cdots < i_k \le n} |S_{i_1} \cap \cdots \cap S_{i_k}| = 0, \tag{4.4}$$

where the summand corresponding to $k = 0$ is defined to be $|S_1 \cup \cdots \cup S_n|$. We will prove 
this formula by introducing an involution on a certain set of signed objects. 

Let $X$ be the set of all sequences $(x; i_1, i_2, \ldots, i_k)$ such that $1 \le i_1 < i_2 < \cdots < i_k \le n$, 
$0 \le k \le n$, and $x \in S_{i_1} \cap \cdots \cap S_{i_k}$. (If $k = 0$, then the object looks like $(x;)$, and the last 
condition is interpreted to mean $x \in S_1 \cup \cdots \cup S_n$.) Define $\mathrm{sgn}(x; i_1, i_2, \ldots, i_k) = (-1)^k$. It 
follows from the Sum Rule that $\sum_{z \in X} \mathrm{sgn}(z)$ is the left side of (4.4). So it suffices to define 
a sign-reversing involution on $X$ with no fixed points. 

Given $z = (x; i_1, \ldots, i_k) \in X$, we must have $x \in S_1 \cup \cdots \cup S_n$ no matter what the value 
of $k$ is. Let $i$ be the minimum index in $\{1, 2, \ldots, n\}$ such that $x \in S_i$. By definition of $X$, we 
either have $k = 0$ or $i < i_1$ or $i = i_1$. If $k = 0$ or $i < i_1$, define $I(z) = (x; i, i_1, i_2, \ldots, i_k)$. If 
instead $i = i_1$, define $I(z) = (x; i_2, \ldots, i_k)$. It is immediate that $I(I(z)) = z$ and $\mathrm{sgn}(I(z)) = 
-\mathrm{sgn}(z)$ for all $z \in X$. 
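
The bookkeeping involution of 4.26 can be exercised on a small concrete family of sets. In the sketch below (illustrative names and test data), an object $(x; i_1, \ldots, i_k)$ is stored as a pair `(x, idx)` with `idx` a sorted tuple of indices; indices are 0-based in the code, whereas the text uses 1-based indices.

```python
from itertools import combinations

def signed_objects(sets):
    """All objects (x; i_1 < ... < i_k) with x in every chosen set;
    for k = 0 the element x ranges over the union of the sets."""
    n = len(sets)
    union = set().union(*sets)
    return [(x, idx)
            for k in range(n + 1)
            for idx in combinations(range(n), k)
            for x in union
            if all(x in sets[i] for i in idx)]

def sign(obj):
    return (-1) ** len(obj[1])

def involution(obj, sets):
    x, idx = obj
    i = min(r for r in range(len(sets)) if x in sets[r])  # least index with x in S_i
    if idx and idx[0] == i:
        return (x, idx[1:])      # remove the leading index i
    return (x, (i,) + idx)       # otherwise i < idx[0] (or k = 0): insert i in front

sets = [{1, 2, 3}, {2, 3, 4}, {3, 5}]
X = signed_objects(sets)
assert all(involution(involution(z, sets), sets) == z for z in X)
assert all(involution(z, sets) != z for z in X)                 # no fixed points
assert all(sign(involution(z, sets)) == -sign(z) for z in X)    # sign-reversing
assert sum(sign(z) for z in X) == 0                             # formula (4.4)
```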


The preceding proof is very ingenious because it establishes a rather complicated formula 
by a remarkably simple bookkeeping bijection. On the other hand, we would also like to 
have a combinatorial proof of inclusion-exclusion that is tied more closely to the intuitive 
“including and excluding” arguments we used originally to guess the formula for $|S \cup T \cup U|$. 
For such a proof, see Exercise 4-69. 

The identities in the next lemma are needed in the next section to prove generalized 
versions of the Inclusion-Exclusion Formula. We prove each identity using an involution. 


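4.27. Lemma. For all integers $p, j \ge 0$, 

$$\text{(a)} \quad \sum_{k=j}^{p} (-1)^{k-j} \binom{p}{k} \binom{k}{j} = \chi(p = j); \qquad \text{(b)} \quad \sum_{k=j}^{p} (-1)^{k-j} \binom{p}{k} \binom{k-1}{j-1} = \chi(p \ge j > 0).$$
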
Proof. (a) If $p < j$, both sides of (a) are zero. If $p = j$, both sides of (a) are 1. So assume 
$p > j$. Let $X$ be the set of words in $\{0, 1, 2\}^p$ containing exactly $j$ zeroes. For a word $w \in X$ 
containing $s$ ones, let $\mathrm{sgn}(w) = (-1)^s$. We can evaluate $\sum_{w \in X} \mathrm{sgn}(w)$ by letting $k$ be the 
number of letters in $w$ equal to 0 or 1. We must have $j \le k \le p$. For a fixed $k$ in this range, 
we can build a word $w$ with $j$ zeroes and $k - j$ ones by picking $k$ positions out of $p$ available 
positions where the 0's and 1's will go, then picking $j$ of these $k$ positions to contain the 
zeroes. The sign of the resulting object is $(-1)^{k-j}$. By the Sum Rule, we conclude that the 
left side of (a) is equal to $\sum_{w \in X} \mathrm{sgn}(w)$. 

To complete the proof of (a), we define an involution $I : X \to X$ with no fixed points. 
Given $w \in X$, choose the minimal $i$ with $w_i \neq 0$. Such an $i$ must exist, since $j < p$. If 
$w_i = 1$, replace this symbol by a 2. If $w_i = 2$, replace this symbol by a 1. This produces 
a new word $I(w) \in X$ with the opposite sign as $w$. For example, if $j = 3$ and $p = 7$, 
$I(0120110) = 0220110$, where $\mathrm{sgn}(0120110) = (-1)^3$ and $\mathrm{sgn}(0220110) = (-1)^2$. Evidently 
$I(I(w)) = w$ for all $w \in X$, and $I$ has no fixed points. 

(b) If $j = 0$ or $p < j$ or $p = j > 0$, the identity is true. So assume $p > j > 0$. Define $X$ 
as in the proof of (a), and let $Y$ be the subset of $X$ consisting of those words $w$ where there 
is no 1 to the left of the first 0. To build a word in $Y$ with $j$ zeroes, $k - j$ ones, and $p - k$ 
twos (for fixed $k$ in the range $j \le k \le p$), first choose $k$ positions out of $p$ for the zeroes and 
ones; then put a zero in the first of these $k$ positions; then choose $j - 1$ of the remaining 
$k - 1$ positions to contain the remaining zeroes. The sign of each such object is $(-1)^{k-j}$, so 
the Sum Rule shows that the left side of (b) is equal to $\sum_{w \in Y} \mathrm{sgn}(w)$. 

We complete the proof of (b) by defining an involution $I : Y \to Y$ with exactly one 
fixed point. For most words $w \in Y$, we form $I(w)$ by looking for the rightmost position 
$i$ with $w_i \neq 0$, and toggling the symbol in this position between 1 and 2. For example, 
$I(2010120) = 2010110$. The action of $I$ produces a new object in the set $Y$ with the opposite 
sign as $w$, except in the case where all nonzero letters in $w$ precede all the zeroes in $w$. In 
this case, since $w \in Y$, $w$ must be the word consisting of $p - j$ twos followed by $j$ zeroes. 
This word is the unique fixed point of $I$, and its sign is $+1$. □ 
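
Both parts of Lemma 4.27 can be spot-checked numerically over a range of $p$ and $j$. The sketch below uses illustrative names and assumes the convention that a binomial coefficient $\binom{n}{k}$ is zero unless $0 \le k \le n$; this convention handles the $j = 0$ case of part (b).

```python
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 unless 0 <= k <= n (assumed convention)."""
    return comb(n, k) if 0 <= k <= n else 0

def lemma_a(p, j):
    return sum((-1) ** (k - j) * binom(p, k) * binom(k, j) for k in range(j, p + 1))

def lemma_b(p, j):
    return sum((-1) ** (k - j) * binom(p, k) * binom(k - 1, j - 1)
               for k in range(j, p + 1))

for p in range(10):
    for j in range(10):
        assert lemma_a(p, j) == int(p == j)
        assert lemma_b(p, j) == int(p >= j > 0)
```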


4.8 Generalized Inclusion-Exclusion Formulas 


Let $S_1, S_2, \ldots, S_n$ be subsets of a finite set $X$. The inclusion-exclusion formulas presented
earlier allow us to count the number of objects that belong to at least one of the sets $S_i$,
as well as the number of objects in $X$ belonging to none of the sets $S_i$. Our next goal is to
generalize these results to inclusion-exclusion formulas that count the objects in at least $j$
of the sets $S_i$, or exactly $j$ of the sets $S_i$, where $j$ is a fixed integer between 1 and $n$.


4.28. Rule for Counting Objects with Exactly $j$ Properties. Let $S_1, S_2, \ldots, S_n$ be
distinct subsets of a finite set $X$. For fixed $j \in \{1, 2, \ldots, n\}$, the number of objects belonging
to exactly $j$ of the sets $S_i$ is

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k}{j} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \tag{4.5} \]


Proof. We start by rewriting (4.5) in more concise notation. Let $[n]$ denote the set
$\{1, 2, \ldots, n\}$. We encode the sequence of subscripts $i_1 < i_2 < \cdots < i_k$ as a $k$-element
subset $I$ of $[n]$. Then (4.5) can be written

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k}{j} \sum_{\substack{I \subseteq [n] \\ |I| = k}} \Bigl| \bigcap_{i \in I} S_i \Bigr|. \]

Next, for any set $S \subseteq X$, observe that $|S| = \sum_{x \in X} \chi(x \in S)$. Since $x \in \bigcap_{i \in I} S_i$ iff $x \in S_i$
for every $i \in I$, the previous formula now becomes

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k}{j} \sum_{\substack{I \subseteq [n] \\ |I| = k}} \sum_{x \in X} \chi(\forall i \in I,\ x \in S_i). \]

Using the distributive law and commutative law for sums of finitely many terms, we can
rewrite this as

\[ \sum_{x \in X} \Biggl[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k}{j} \sum_{\substack{I \subseteq [n] \\ |I| = k}} \chi(\forall i \in I,\ x \in S_i) \Biggr]. \tag{4.6} \]

For each $x \in X$, define $P(x) = \{i \in [n] : x \in S_i\}$, so $|P(x)|$ tells us how many sets $S_i$
contain $x$. To complete the proof, it suffices to show that the summand in (4.6) indexed by
$x$ evaluates to $\chi(|P(x)| = j)$. Fix $x \in X$, and let $p = |P(x)|$. As $k$ and $I$ vary within the
summand indexed by this $x$, we obtain a nonzero contribution to this summand iff $j \le k \le p$
and $I \subseteq P(x)$. So we want to show that

\[ \sum_{k=j}^{p} (-1)^{k-j} \binom{k}{j} \sum_{\substack{I \subseteq P(x) \\ |I| = k}} 1 = \chi(p = j). \]

The inner sum is the number of $k$-element subsets of the $p$-element set $P(x)$, which is $\binom{p}{k}$
by the Subset Rule. So we are reduced to showing that

\[ \sum_{k=j}^{p} (-1)^{k-j} \binom{k}{j} \binom{p}{k} = \chi(p = j), \]

which is the identity we proved in Lemma 4.27(a). □


4.29. Rule for Counting Objects with at Least $j$ Properties. Let $S_1, S_2, \ldots, S_n$ be
distinct subsets of a finite set $X$. For fixed $j \in \{1, 2, \ldots, n\}$, the number of objects belonging
to at least $j$ of the sets $S_i$ is

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k-1}{j-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \tag{4.7} \]

The proof is nearly identical to the one just given, so we leave it as an exercise.
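
Formulas (4.5) and (4.7) can be compared with a direct count on a small example. The Python sketch below is an illustration only; the helper names and the sample sets (multiples of 2, 3, 5, and 7 among the integers 1 to 60) are our own choices and do not come from the text.

```python
from itertools import combinations
from math import comb

def intersection_sums(sets, universe):
    # N[k] = sum of |S_{i_1} ∩ ... ∩ S_{i_k}| over all k-element index sets
    n = len(sets)
    N = [0] * (n + 1)
    for k in range(1, n + 1):
        for indices in combinations(range(n), k):
            common = set(universe)
            for i in indices:
                common &= sets[i]
            N[k] += len(common)
    return N

def count_exactly(sets, universe, j):
    # formula (4.5)
    N = intersection_sums(sets, universe)
    return sum((-1) ** (k - j) * comb(k, j) * N[k] for k in range(j, len(sets) + 1))

def count_at_least(sets, universe, j):
    # formula (4.7)
    N = intersection_sums(sets, universe)
    return sum((-1) ** (k - j) * comb(k - 1, j - 1) * N[k] for k in range(j, len(sets) + 1))

def membership_histogram(sets, universe):
    # direct count: hist[r] = number of objects lying in exactly r of the sets
    hist = [0] * (len(sets) + 1)
    for x in universe:
        hist[sum(x in S for S in sets)] += 1
    return hist

universe = set(range(1, 61))
sets = [{x for x in universe if x % d == 0} for d in (2, 3, 5, 7)]
hist = membership_histogram(sets, universe)
for j in range(1, len(sets) + 1):
    # each row: j, formula (4.5), direct count, formula (4.7), direct count
    print(j, count_exactly(sets, universe, j), hist[j],
          count_at_least(sets, universe, j), sum(hist[j:]))
```

In each printed row the second and third entries should agree, as should the fourth and fifth.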


4.30. Example. How many permutations of $[n] = \{1, 2, \ldots, n\}$ have exactly $j$ fixed points?
To solve this, let $X$ be the set of all permutations of $[n]$. For $1 \le i \le n$, let $S_i = \{f \in X : f(i) = i\}$.
We saw in an earlier example that for any $i_1 < i_2 < \cdots < i_k$, $|S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}| = (n-k)!$.
Since there are $\binom{n}{k}$ choices for $\{i_1, i_2, \ldots, i_k\}$, formula (4.5) gives

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k}{j} \binom{n}{k} (n-k)! = \frac{n!}{j!} \sum_{k=j}^{n} \frac{(-1)^{k-j}}{(k-j)!} \]

as the answer. Alternatively, we can solve this problem using derangements. We build a
typical permutation of $[n]$ with $j$ fixed points by first choosing which $j$ of the $n$ inputs will
be fixed (there are $\binom{n}{j}$ possibilities), then choosing a derangement of the remaining $n - j$
values (there are $d_{n-j}$ possibilities). By the Product Rule, the answer is $\binom{n}{j} d_{n-j}$. Replacing
$d_{n-j}$ by the summation given in 4.16 shows that our two answers agree. If we instead ask
for the number of permutations of $[n]$ with at least $j$ fixed points, (4.7) gives the answer

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k-1}{j-1} \binom{n}{k} (n-k)! = \frac{n!}{(j-1)!} \sum_{k=j}^{n} \frac{(-1)^{k-j}}{k\,(k-j)!}. \]
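
The fixed-point counts in this example can be confirmed by brute force for small $n$. The following Python sketch is an illustration under our own naming; it computes $d_m$ from the recursion $d_m = m\,d_{m-1} + (-1)^m$ with the usual convention $d_0 = 1$, and compares the formulas with an explicit enumeration of permutations.

```python
from itertools import permutations
from math import comb, factorial

def derangements(m):
    # d_m via d_m = m * d_{m-1} + (-1)^m, starting from d_0 = 1
    d = 1
    for i in range(1, m + 1):
        d = i * d + (-1) ** i
    return d

def exactly_j_fixed(n, j):
    # formula (4.5) with each k-fold intersection of size (n - k)!
    return sum((-1) ** (k - j) * comb(k, j) * comb(n, k) * factorial(n - k)
               for k in range(j, n + 1))

def at_least_j_fixed(n, j):
    # formula (4.7) with each k-fold intersection of size (n - k)!
    return sum((-1) ** (k - j) * comb(k - 1, j - 1) * comb(n, k) * factorial(n - k)
               for k in range(j, n + 1))

def fixed_point_histogram(n):
    # counts[r] = number of permutations of {0, ..., n-1} with exactly r fixed points
    counts = [0] * (n + 1)
    for f in permutations(range(n)):
        counts[sum(f[i] == i for i in range(n))] += 1
    return counts

n = 6
counts = fixed_point_histogram(n)
for j in range(1, n + 1):
    print(j, counts[j], exactly_j_fixed(n, j), comb(n, j) * derangements(n - j),
          sum(counts[j:]), at_least_j_fixed(n, j))
```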


4.9 Möbius Inversion in Number Theory


We conclude this chapter with an introduction to the theory of Möbius inversion, which
generalizes the inclusion-exclusion techniques studied so far. We begin in this section by
describing the number-theoretic Möbius function and the corresponding Möbius Inversion
Formula. Later sections discuss the generalization of the Möbius function and inversion
formula to posets.

4.31. Definition: Number-Theoretic Möbius Function. Suppose $m \ge 1$ is an integer
with prime factorization $m = p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}$, where $n \ge 0$, $e_i > 0$, and the $p_i$'s are distinct
primes. (We take $n = 0$ when $m = 1$.) The Möbius function $\mu : \mathbb{Z}_{>0} \to \{-1, 0, 1\}$ is defined
by $\mu(m) = 0$ if $e_i > 1$ for some $i$, and $\mu(m) = (-1)^n$ if $e_i = 1$ for all $i$.

In other words, $\mu(m)$ is zero if $m$ is divisible by the square of a prime; $\mu(m) = +1$ if $m$
is the product of an even number of distinct primes; and $\mu(m) = -1$ if $m$ is the product of
an odd number of distinct primes. For example,

\[ \mu(1) = 1, \quad \mu(7) = -1, \quad \mu(10) = 1, \quad \mu(12) = 0, \quad \mu(30) = -1. \]
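
The definition translates directly into a short computation. The Python sketch below is one possible implementation by trial division (the function name `mobius` is our own); it reproduces the sample values listed above.

```python
def mobius(m):
    # mu(m) for an integer m >= 1, computed by trial division
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0      # p^2 divides the original m
            result = -result  # one more distinct prime factor
        p += 1
    if m > 1:
        result = -result      # a remaining prime factor larger than sqrt(m)
    return result

print([mobius(m) for m in (1, 7, 10, 12, 30)])  # [1, -1, 1, 0, -1]
```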


The following lemma is the key to proving the Möbius Inversion Formula. Here and
below, we use the notation $d \mid m$ to mean that $d$ divides $m$, i.e., $m = cd$ for some integer
$c$. The symbol $\sum_{d \mid m}$ indicates a sum ranging over all positive divisors $d$ of $m$. The reader
should take care not to confuse the statement $d \mid m$ with the rational number $d/m$.

4.32. Lemma. For all integers $m \ge 1$, $\sum_{d \mid m} \mu(d) = \chi(m = 1)$.


Proof. When $m = 1$, we have $\sum_{d \mid 1} \mu(d) = \mu(1) = 1 = \chi(m = 1)$. Suppose next that $m > 1$
and $m$ has prime factorization $p_1^{e_1} \cdots p_n^{e_n}$. Instead of summing $\mu(d)$ over all divisors $d$ of
$m$, we may equally well sum over just the square-free divisors $d$ of $m$, which give the only
nonzero contributions to the sum. Examining prime factorizations, we see that there are
$2^n$ such square-free divisors, which have the form $\prod_{i \in T} p_i$ as $T$ ranges over all subsets of
$[n] = \{1, 2, \ldots, n\}$. Therefore,

\[ \sum_{d \mid m} \mu(d) = \sum_{T \subseteq [n]} \mu\Bigl( \prod_{i \in T} p_i \Bigr) = \sum_{T \subseteq [n]} (-1)^{|T|}. \]

Collecting together summands indexed by subsets $T$ of the same size $k$, we conclude that

\[ \sum_{d \mid m} \mu(d) = \sum_{k=0}^{n} \sum_{\substack{T \subseteq [n] \\ |T| = k}} (-1)^k = \sum_{k=0}^{n} \binom{n}{k} (-1)^k = 0, \]

where the last step follows from Theorem 4.19. □


4.33. Number-Theoretic Möbius Inversion Formula. Suppose $f, g : \mathbb{Z}_{>0} \to \mathbb{R}$ are
functions such that for all positive integers $m$,

\[ f(m) = \sum_{d \mid m} g(d). \]

Then for all $m \ge 1$,

\[ g(m) = \sum_{d \mid m} f(m/d)\mu(d) = \sum_{d \mid m} f(d)\mu(m/d). \]


Proof. We use the definition of $f$ to expand the first claimed formula for $g(m)$:

\[ \sum_{d \mid m} f(m/d)\mu(d) = \sum_{d \mid m} \Bigl( \sum_{c \mid (m/d)} g(c) \Bigr) \mu(d) = \sum_{(c,d) \in S} g(c)\mu(d), \]

where $S = \{(c, d) \in \mathbb{Z}_{>0} \times \mathbb{Z}_{>0} : d \mid m \text{ and } c \mid (m/d)\}$. It follows routinely from the definition
of divisibility that

\[ S = \{(c,d) : d \mid m \text{ and } cd \mid m\} = \{(c,d) : c \mid m \text{ and } cd \mid m\} = \{(c,d) : c \mid m \text{ and } d \mid (m/c)\}. \]

Therefore, the calculation continues as follows:

\[ \sum_{(c,d) \in S} g(c)\mu(d) = \sum_{c \mid m} \sum_{d \mid (m/c)} g(c)\mu(d) = \sum_{c \mid m} g(c) \Bigl( \sum_{d \mid (m/c)} \mu(d) \Bigr) = \sum_{c \mid m} g(c)\chi(m/c = 1) = g(m). \]

The next-to-last step used Lemma 4.32 to simplify the inner sum. We conclude that

\[ g(m) = \sum_{d \mid m} f(m/d)\mu(d) = \sum_{d \mid m} f(d)\mu(m/d), \]

where the final equality results by replacing the summation variable $d$ by $m/d$. This is
permissible, since $m/d$ ranges over all positive divisors of $m$ as $d$ ranges over all positive
divisors of $m$. □
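
The inversion formula can be checked numerically by starting from an arbitrary function $g$, forming $f(m) = \sum_{d \mid m} g(d)$, and recovering $g$ from $f$. The Python sketch below does this; all names and the particular test function are our own illustrative choices.

```python
def divisors(m):
    # positive divisors of m, in increasing order
    return [d for d in range(1, m + 1) if m % d == 0]

def mobius(m):
    # number-theoretic Möbius function by trial division
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def divisor_sum(g, m):
    # f(m) = sum of g(d) over all d dividing m
    return sum(g(d) for d in divisors(m))

def invert(f, m):
    # g(m) = sum of f(m/d) * mu(d) over all d dividing m
    return sum(f(m // d) * mobius(d) for d in divisors(m))

def invert_alt(f, m):
    # g(m) = sum of f(d) * mu(m/d) over all d dividing m
    return sum(f(d) * mobius(m // d) for d in divisors(m))

g = lambda d: d * d - 3 * d + 7          # an arbitrary test function
f = lambda m: divisor_sum(g, m)
print([invert(f, m) for m in range(1, 9)])
print([g(m) for m in range(1, 9)])
```

The two printed lists should coincide, since `invert` undoes `divisor_sum`.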


To give examples of the Möbius Inversion Formula, we first introduce some functions
that are studied in number theory.

4.34. Definition: Number-Theoretic Functions $\tau$, $\sigma$, and $\sigma_2$. For each positive integer
$m$, define

\[ \tau(m) = \sum_{d \mid m} 1; \qquad \sigma(m) = \sum_{d \mid m} d; \qquad \sigma_2(m) = \sum_{d \mid m} d^2. \]

Thus, $\tau(m)$ is the number of positive divisors of $m$; $\sigma(m)$ is the sum of these divisors; and
$\sigma_2(m)$ is the sum of the squares of these divisors.

4.35. Example. Taking $m$ to be 1, 4, 7, 12, and 30, we calculate:

\[ \tau(1) = 1, \quad \tau(4) = 3, \quad \tau(7) = 2, \quad \tau(12) = 6, \quad \tau(30) = 8; \]
\[ \sigma(1) = 1, \quad \sigma(4) = 7, \quad \sigma(7) = 8, \quad \sigma(12) = 28, \quad \sigma(30) = 72; \]
\[ \sigma_2(1) = 1, \quad \sigma_2(4) = 21, \quad \sigma_2(7) = 50, \quad \sigma_2(12) = 210, \quad \sigma_2(30) = 1300. \]

If $m$ has prime factorization $p_1^{e_1} \cdots p_n^{e_n}$, then the divisors of $m$ have the form $p_1^{f_1} \cdots p_n^{f_n}$
where $0 \le f_i \le e_i$ for all $i$. The Product Rule therefore gives $\tau(m) = \prod_{i=1}^{n} (e_i + 1)$ (build
a divisor by choosing $f_1, \ldots, f_n$). Using the Generalized Distributive Law (Exercise 2-16)
and the Geometric Series Formula, it can also be checked that

\[ \sigma(m) = \prod_{i=1}^{n} \Bigl( \sum_{f_i = 0}^{e_i} p_i^{f_i} \Bigr) = \prod_{i=1}^{n} \frac{p_i^{e_i + 1} - 1}{p_i - 1}. \]
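
The values in Example 4.35 and the product formulas for $\tau$ and $\sigma$ can be reproduced with a few lines of Python. This is a sketch with our own helper names; the brute-force function sums $d^k$ over divisors, so $k = 0, 1, 2$ give $\tau$, $\sigma$, and $\sigma_2$.

```python
def factorize(m):
    # prime factorization of m as a dictionary {prime: exponent}
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def divisor_power_sum(m, k):
    # brute force: sum of d^k over the positive divisors d of m
    return sum(d ** k for d in range(1, m + 1) if m % d == 0)

def tau_formula(m):
    # product of (e_i + 1) over the prime factorization
    result = 1
    for e in factorize(m).values():
        result *= e + 1
    return result

def sigma_formula(m):
    # product of (p_i^(e_i + 1) - 1) / (p_i - 1) over the prime factorization
    result = 1
    for p, e in factorize(m).items():
        result *= (p ** (e + 1) - 1) // (p - 1)
    return result

for m in (1, 4, 7, 12, 30):
    print(m, divisor_power_sum(m, 0), divisor_power_sum(m, 1), divisor_power_sum(m, 2))
```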


Applying the Möbius Inversion Formula to the definitions of $\tau$, $\sigma$, and $\sigma_2$, we obtain the
following identities.

4.36. Theorem. For all integers $m \ge 1$, we have

\[ 1 = \sum_{d \mid m} \tau(m/d)\mu(d); \qquad m = \sum_{d \mid m} \sigma(m/d)\mu(d); \qquad m^2 = \sum_{d \mid m} \sigma_2(m/d)\mu(d). \]

The next result uses Möbius inversion to deduce information about Euler's $\phi$ function.

4.37. Theorem: $\phi$ and $\mu$. For all integers $m \ge 1$,

\[ m = \sum_{d \mid m} \phi(d) \quad \text{and so} \quad \phi(m) = \sum_{d \mid m} \mu(d)(m/d). \]

Proof. To prove the first formula, fix $m \ge 1$. For each divisor $d$ of $m$, let

\[ S_d = \{x \in \mathbb{Z}_{>0} : 1 \le x \le m \text{ and } \gcd(x, m) = d\}. \]

It is immediate that the $m$-element set $\{1, 2, \ldots, m\}$ is the disjoint union of the sets $S_d$ as
$d$ ranges over the positive divisors of $m$. Whenever $d$ divides $m$, we have $\gcd(x, m) = d$ iff
$d$ divides $x$ and $\gcd(x/d, m/d) = 1$. It follows that division by $d$ gives a bijection from the
set $S_d$ onto the set of numbers counted by $\phi(m/d)$. Therefore, $|S_d| = \phi(m/d)$. By the Sum
Rule,

\[ m = \sum_{d \mid m} |S_d| = \sum_{d \mid m} \phi(m/d) = \sum_{d \mid m} \phi(d), \]

where the last equality follows by replacing the summation variable $d$ by $m/d$. Applying
Möbius inversion (with $f(m) = m$ and $g(m) = \phi(m)$), we obtain the second formula in the
theorem. □
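
Both formulas of Theorem 4.37, as well as the sizes of the sets $S_d$ used in the proof, can be checked directly for small $m$. The Python sketch below uses our own function names and a brute-force definition of $\phi$.

```python
from math import gcd

def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]

def phi(m):
    # Euler's phi function by direct counting
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def mobius(m):
    # number-theoretic Möbius function by trial division
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def phi_via_mobius(m):
    # second formula of the theorem: sum of mu(d) * (m/d) over d dividing m
    return sum(mobius(d) * (m // d) for d in divisors(m))

def gcd_classes(m):
    # the sets S_d = {x : 1 <= x <= m and gcd(x, m) = d} from the proof
    return {d: [x for x in range(1, m + 1) if gcd(x, m) == d] for d in divisors(m)}

m = 24
print(sum(phi(d) for d in divisors(m)))                       # 24
print(phi(m), phi_via_mobius(m))                              # 8 8
print({d: len(S) for d, S in gcd_classes(m).items()})
```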


Some applications of these results to field theory are presented in §12.6.




4.10 Partially Ordered Sets 


The Inclusion-Exclusion Formula 4.1 and the Möbius Inversion Formula 4.33 are special
cases of the general Möbius Inversion Formula for partially ordered sets (posets). Before
discussing this, we review some definitions and examples concerning posets.

Recall from Definition 2.50 the definition of a relation and the notions of reflexive, 
irreflexive, symmetric, antisymmetric, and transitive relations. Given a relation R on a 
finite set X, the pair (X, R) is a digraph G with vertex set X and directed edge set R. 
Reflexivity means that every vertex of G has a loop edge; irreflexivity means that no vertex
of G has a loop edge. Symmetry means that the reversal of every edge is also an edge (so
we can think of G as undirected); antisymmetry means that it is never true that a non-loop
edge and its reversal are both in G. Finally, transitivity means that whenever there is a
walk (x, y, z) of length 2 in G, the edge (x, z) is also present in G. More generally, we see by
induction that when R is transitive, there exists a walk from x to z in G of positive length 
iff the edge (x, z) is present in G. 


4.38. Poset Definitions. A partial order relation on $X$ is a relation that is antisymmetric,
transitive, and reflexive on $X$. A strict order relation on $X$ is a relation that is transitive
and irreflexive on $X$. A partially ordered set (poset) is a pair $(X, \le)$ where $\le$ is a partial
order relation on $X$. A totally ordered set is a poset $(X, \le)$ such that for all $x, y \in X$, either
$x \le y$ or $y \le x$.

4.39. Example. Let $X = \{1, 2, \ldots, n\}$ and take $\le$ to be the usual ordering of integers.
Then $(X, \le)$ is an $n$-element totally ordered poset. More generally, for any $S \subseteq \mathbb{R}$, $(S, \le)$ is
a totally ordered poset.

4.40. Example: Boolean Posets. Let $S$ be any set, and let $X = \mathcal{P}(S)$ be the set of
all subsets of $S$. Then $(X, \subseteq)$ is a poset, where $A \subseteq B$ means that $A$ is a subset of $B$. In
particular, $(\mathcal{P}(\{1, 2, \ldots, n\}), \subseteq)$ is a poset of size $2^n$. This poset is not totally ordered when
$n > 1$, since the statements $\{1\} \subseteq \{2\}$ and $\{2\} \subseteq \{1\}$ are both false.

4.41. Example: Divisibility Posets. Consider the divisibility relation $\mid$ on $\mathbb{Z}_{>0}$ defined
by $a \mid b$ iff $b = ac$ for some $c \in \mathbb{Z}_{>0}$. Then $(\mathbb{Z}_{>0}, \mid)$ is an infinite poset. Given a fixed positive
integer $n$, let $X$ be the set of all divisors of $n$. Restricting $\mid$ to $X$ gives a finite poset $(X, \mid)$.
This poset is a totally ordered set iff $n$ is a prime power.


The next result shows that partial order relations and strict order relations are essentially 
equivalent concepts. 


4.42. Theorem: Partial Orders and Strict Orders. Let $X$ be a set, let $P$ be the set
of all partial order relations on $X$, and let $S$ be the set of all strict order relations on $X$.
There are canonical bijections $f : P \to S$ and $g : S \to P$.

Proof. Let $\Delta = \{(x, x) : x \in X\}$ be the diagonal of $X \times X$. Define $f : P \to S$ by setting
$f(R) = R - \Delta$ for each partial ordering $R$ on $X$. Define $g : S \to P$ by setting $g(T) = T \cup \Delta$
for each strict ordering $T$ on $X$. Viewing relations as digraphs as explained above, $f$ removes
self-loops from all vertices, and $g$ restores the self-loops. One may now check that $f$ does
map $P$ into $S$, $g$ does map $S$ into $P$, and $f \circ g$ and $g \circ f$ are both identity maps. □




4.11 Möbius Inversion for Posets


This section introduces Möbius functions for posets and proves a generalization of the
Möbius Inversion Formula.

4.43. Definition: Matrix of a Relation. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a finite set, and
let $R$ be a relation on $X$. Define the matrix of $R$ to be the $n \times n$ matrix $A = A(R)$ with
$i,j$-entry $A_{ij} = \chi(x_i R x_j)$.

Note that $A(R)$ is the adjacency matrix of the digraph $(X, R)$.

4.44. Theorem. Let $\le$ be a partial ordering of $X = \{x_1, \ldots, x_n\}$, and let $<$ be the
associated strict ordering of $X$ (see Theorem 4.42). Consider the matrices $Z = A(\le)$ and
$N = A(<)$. Then $Z = I + N$; $N$ is nilpotent; $Z$ is invertible; and

\[ Z^{-1} = I - N + N^2 - N^3 + \cdots + (-1)^{n-1} N^{n-1}. \tag{4.8} \]

Proof. The matrix identity $Z = I + N$ holds since $(X, \le)$ is obtained from $(X, <)$ by
adding loop edges at each $x \in X$. Next, we claim that the digraph $(X, <)$ is acyclic. For if
$(z_1, z_2, \ldots, z_k, z_1)$ were a directed cycle in this digraph, we must have $z_1 < z_2 < \cdots < z_k < z_1$.
Then transitivity gives $z_1 < z_1$, which contradicts irreflexivity. By Theorem 3.25, $N$ is
nilpotent. The statements about the inverse of $Z$ now follow from Theorem 3.26, taking $A$
there to be $-N$. □


4.45. Definition: Möbius Function of a Finite Poset. Keeping the notation of the
preceding theorem, define $\mu : X \times X \to \mathbb{Z}$ by setting $\mu(x_i, x_j)$ to be the $i,j$-entry of $Z^{-1}$.
The function $\mu$ is called the Möbius function of the poset $(X, \le)$. When discussing several
posets at once, we sometimes write $\mu_X$ or $\mu_{\le}$ or $\mu_{(X,\le)}$ to denote the Möbius function of
$(X, \le)$.


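4.46. Example. Let $X = \{1, 2, 3, 4\}$ with the total ordering $1 < 2 < 3 < 4$. For this poset,
we have

\[ Z = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad N = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \]

The powers of $N$ are

\[ N^2 = \begin{bmatrix} 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad N^3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad N^4 = 0. \]

The inverse of $Z$ is

\[ Z^{-1} = I - N + N^2 - N^3 = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]

So for $i, j \in X$, $\mu(i, j) = 1$ if $j = i$, $\mu(i, j) = -1$ if $j = i + 1$, and $\mu(i, j) = 0$ otherwise.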
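
The matrix computation in this example follows formula (4.8) mechanically, so it is easy to automate. The Python sketch below (plain lists, with helper names of our own) builds $N$ from an order relation and sums the alternating series $I - N + N^2 - \cdots$ up to the power $N^{n-1}$.

```python
def matmul(A, B):
    # product of two square matrices stored as lists of rows
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mobius_matrix(elements, leq):
    # Z^{-1} computed from (4.8); leq(a, b) decides whether a <= b in the poset
    n = len(elements)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    # N is the matrix of the strict order: N[i][j] = 1 iff x_i < x_j
    N = [[int(i != j and leq(elements[i], elements[j])) for j in range(n)]
         for i in range(n)]
    result = [[0] * n for _ in range(n)]
    power, sgn = identity, 1
    for _ in range(n):  # terms N^0, N^1, ..., N^(n-1)
        result = [[result[i][j] + sgn * power[i][j] for j in range(n)] for i in range(n)]
        power, sgn = matmul(power, N), -sgn
    return result

M = mobius_matrix([1, 2, 3, 4], lambda a, b: a <= b)
for row in M:
    print(row)
```

The printed rows should match the matrix $Z^{-1}$ displayed above. Passing a different comparison, such as `lambda a, b: b % a == 0` on a list of divisors, gives the Möbius matrix of a divisibility poset.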


4.47. Example: Möbius Function of a Totally Ordered Poset. The preceding example
generalizes as follows. Let $X = \{1, 2, \ldots, n\}$ with the ordering $1 < 2 < \cdots < n$. We
have $Z_{i,j} = 1$ for $i \le j$ and $Z_{i,j} = 0$ for $i > j$. For all $i, j \in X$, let $M_{i,j} = 1$ if $j = i$,
$M_{i,j} = -1$ if $j = i + 1$, and $M_{i,j} = 0$ otherwise. A routine matrix calculation shows that
$ZM = MZ = I$. So for this poset,

\[ \mu(i, i) = 1, \qquad \mu(i, i+1) = -1, \qquad \mu(i, j) = 0 \text{ for } j \neq i, i+1. \]


4.48. Example: Möbius Function of a Boolean Poset. Consider the poset $(X, \subseteq)$,
where $X$ consists of all subsets of $[n] = \{1, 2, \ldots, n\}$. In this example, we will index the
rows and columns of matrices by subsets of $[n]$. For $S, T \subseteq [n]$, the $S,T$-entry of $Z$ is 1
if $S \subseteq T$, and 0 otherwise. We claim that the inverse matrix $M = Z^{-1}$ has $S,T$-entry
$\mu(S, T) = (-1)^{|T - S|}$ if $S \subseteq T$, and zero otherwise. To verify this, let us show that $ZM = I$.
The $S,T$-entry of $ZM$ is

\[ (ZM)(S, T) = \sum_{U \subseteq [n]} Z(S, U) M(U, T) = \sum_{U :\, S \subseteq U \subseteq T} (-1)^{|T - U|}. \]

If $S = T$, this sum is 1; while if $S \not\subseteq T$, this sum is 0. Now consider the case where $S \subsetneq T$.
Let $S$ have $a$ elements and $T$ have $a + b$ elements, where $b > 0$. For $0 \le c \le b$, the number
of sets $U$ with $S \subseteq U \subseteq T$ and $|T - U| = c$ is $\binom{b}{c}$, since we can build such a subset $U$ by
choosing a subset of $c$ elements from the $b$-element set $T - S$ and removing these elements
from $T$ to get $U$. Grouping terms in the sum for $(ZM)(S, T)$ based on the size of $|T - U|$,
we see from Theorem 4.19 that

\[ (ZM)(S, T) = \sum_{c=0}^{b} (-1)^c \binom{b}{c} = 0. \]

So the Möbius function for this poset is

\[ \mu(S, T) = (-1)^{|T - S|} \chi(S \subseteq T) \quad \text{for all } S, T \subseteq [n]. \]

An alternate proof of this formula will be given in Example 4.61 below.
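
The identity $ZM = I$ for the Boolean poset can be confirmed entry by entry for a small value of $n$. The Python sketch below uses frozensets for the subsets of $[n]$; the function names are our own.

```python
from itertools import combinations

def all_subsets(n):
    # every subset of {1, ..., n}, as a frozenset
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(1, n + 1), k)]

def zeta(S, T):
    # S,T-entry of Z
    return 1 if S <= T else 0

def mu(S, T):
    # claimed S,T-entry of M: (-1)^{|T - S|} if S is a subset of T, else 0
    return (-1) ** len(T - S) if S <= T else 0

def product_entry(S, T, X):
    # S,T-entry of the matrix product ZM
    return sum(zeta(S, U) * mu(U, T) for U in X)

X = all_subsets(3)
print(all(product_entry(S, T, X) == int(S == T) for S in X for T in X))  # True
```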


4.49. Example: Möbius Function of a Divisibility Poset. Let $n$ be a fixed positive
integer, let $X$ be the set of positive divisors of $n$, and consider the divisibility poset $(X, \mid)$.
There is a close relation between the number-theoretic Möbius function $\mu$ and the Möbius
function $\mu_X$ for this poset. More precisely, we claim that

\[ \mu(d) = \mu_X(1, d) \quad \text{for all } d \text{ dividing } n. \]

To verify this, let us work with matrices whose rows and columns are indexed by the
positive divisors of $n$, considered in increasing order. As above, let $Z$ be the matrix such
that $Z_{d,e} = \chi(d \mid e)$; let $M$ be the inverse matrix, which is uniquely determined by $Z$; and
let $v$ be the row vector $(\mu(d) : d \mid n)$. The identity $\sum_{d \mid m} \mu(d) = \chi(m = 1)$, which is valid
for all $m$ dividing $n$, can now be rewritten as the vector identity $vZ = (1, 0, \ldots, 0)$. This
shows that $v$ must be the first row of $M$. More generally, we show in Example 4.62 that
$\mu_X(d, e) = \mu(e/d)$ whenever $d \mid e$ and $e \mid n$, whereas $\mu_X(d, e) = 0$ if $d$ does not divide $e$.


The next definition will be used to give a combinatorial interpretation for the values of
the Möbius function.

4.50. Definition: Chains in a Poset. Let $(X, \le)$ be a poset. A chain of length $k$ in $X$ is
a sequence $C = (z_0, z_1, \ldots, z_k)$ of elements of $X$ such that $z_0 < z_1 < \cdots < z_k$. We say that
$C$ is a chain from $z_0$ to $z_k$ and write $\operatorname{len}(C) = k$. The sign of the chain $C$ is $\operatorname{sgn}(C) = (-1)^k$.


4.51. Theorem: Möbius Functions and Signed Chains. Let $(X, \le)$ be a finite poset.
Given $y, z \in X$, let $S$ be the set of all chains in $X$ from $y$ to $z$. Then

\[ \mu_{(X,\le)}(y, z) = \sum_{C \in S} \operatorname{sgn}(C). \]

In particular, if $y \not\le z$, then $\mu_{(X,\le)}(y, z) = 0$.

Proof. We know from (4.8) that

\[ \mu_{(X,\le)}(y, z) = \sum_{k \ge 0} (-1)^k N^k(y, z), \]

where $N$ is the adjacency matrix of the digraph $G = (X, <)$. A chain of length $k$ from $y$ to
$z$ is the same as a walk (or path) of length $k$ from $y$ to $z$ in $G$. By the Walk Rule 3.19, the
number of such walks is $N^k(y, z)$, and each walk has sign $(-1)^k$. The theorem now follows
from the Sum Rule. □
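
The signed-chain description leads to a simple recursive computation: a chain from $y$ to $z$ with $y < z$ is a chain from $y$ to some $w$ with $y \le w < z$, followed by the final step from $w$ to $z$, which reverses the sign. The Python sketch below (our own naming, suitable only for small posets because it has no memoization) uses this observation on the divisors of 12.

```python
def mobius_by_chains(elements, less, y, z):
    # signed count of chains y = z_0 < z_1 < ... < z_k = z
    if y == z:
        return 1  # the single chain (y) has length 0 and sign +1
    # delete the last step w < z; the shorter chain from y to w has the opposite sign
    return -sum(mobius_by_chains(elements, less, y, w)
                for w in elements
                if (w == y or less(y, w)) and less(w, z))

divs = [1, 2, 3, 4, 6, 12]
strictly_divides = lambda a, b: a != b and b % a == 0
print([mobius_by_chains(divs, strictly_divides, 1, d) for d in divs])
# [1, -1, -1, 0, 1, 0]
```

The printed values agree with the number-theoretic values $\mu(1), \mu(2), \mu(3), \mu(4), \mu(6), \mu(12)$, as Example 4.49 predicts.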


4.52. Theorem: Möbius Inversion Formula for Posets. Let $(X, \le)$ be a finite poset
with Möbius function $\mu$. For any functions $f, g : X \to \mathbb{R}$,

\[ \forall x \in X,\ g(x) = \sum_{y \le x} f(y) \quad \text{iff} \quad \forall x \in X,\ f(x) = \sum_{y \le x} g(y)\mu(y, x). \]

Proof. Let $X = \{x_1, \ldots, x_n\}$, and define $Z = A(\le)$ and $M = Z^{-1}$ as in Theorem 4.44. Also
define row vectors $F = [f(x_1)\ \ldots\ f(x_n)]$ and $G = [g(x_1)\ \ldots\ g(x_n)]$. The first formula in
the theorem is equivalent to the matrix identity $G = FZ$, since $G_j = g(x_j)$ and

\[ (FZ)_j = \sum_{k=1}^{n} F_k Z_{kj} = \sum_{k=1}^{n} f(x_k)\chi(x_k \le x_j) = \sum_{y \le x_j} f(y). \]

Similarly, keeping in mind that $\mu(y, x) \neq 0$ implies $y \le x$, the second formula in the theorem
is equivalent to the matrix identity $F = GM$. Since $M$ and $Z$ are inverse matrices, $G = FZ$
is equivalent to $GM = F$. □


4.53. Example. In the special case where $X = \{1, 2, \ldots, n\}$ with the ordering $1 < 2 < \cdots < n$,
Theorem 4.52 reduces to the following statement: given $f_1, \ldots, f_n \in \mathbb{R}$ and $g_1, \ldots, g_n \in \mathbb{R}$,
we have ($g_i = f_1 + f_2 + \cdots + f_i$ for all $i$) iff ($f_1 = g_1$ and $f_i = g_i - g_{i-1}$ for $1 < i \le n$).
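
In this totally ordered case, Möbius inversion is the familiar fact that taking successive differences undoes taking partial sums. A minimal Python illustration, with function names and sample data of our own:

```python
from itertools import accumulate

def partial_sums(f):
    # g_i = f_1 + f_2 + ... + f_i
    return list(accumulate(f))

def differences(g):
    # f_1 = g_1 and f_i = g_i - g_{i-1} for i > 1
    return [g[i] - (g[i - 1] if i > 0 else 0) for i in range(len(g))]

f = [3, -1, 4, 1, -5, 9]
g = partial_sums(f)
print(g)               # [3, 2, 6, 7, 2, 11]
print(differences(g))  # [3, -1, 4, 1, -5, 9]
```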


4.54. Example. In the special case where $X$ is the set of positive divisors of $n$ ordered by
divisibility, Theorem 4.52 reduces to the number-theoretic Möbius Inversion Formula 4.33,
using the fact that $\mu_X(d, e) = \mu(e/d)$ when $d \mid e$, and $\mu_X(d, e) = 0$ otherwise.


4.55. Example. In the special case where $X = \mathcal{P}([n])$ ordered by containment of subsets,
Theorem 4.52 reduces to the following statement:

\[ \forall T \subseteq [n],\ g(T) = \sum_{S \subseteq T} f(S) \quad \text{iff} \quad \forall T \subseteq [n],\ f(T) = \sum_{S \subseteq T} (-1)^{|T - S|} g(S). \]

If instead we use the opposite poset $(X, \supseteq)$, we obtain:

\[ \forall T \subseteq [n],\ g(T) = \sum_{S \supseteq T} f(S) \quad \text{iff} \quad \forall T \subseteq [n],\ f(T) = \sum_{S \supseteq T} (-1)^{|S - T|} g(S). \]

We now use this result to rederive a version of the original Inclusion-Exclusion Formula. Let
$Z_1, \ldots, Z_n$ be given subsets of a finite set $Z$. For $S \subseteq [n]$, let $f(S)$ be the number of objects
$z \in Z$ such that $z \in Z_i$ if and only if $i \in S$. For $S \subseteq [n]$, let $g(S)$ be the number of objects
$z \in Z$ such that $z \in Z_i$ whenever $i \in S$. Regarding $Z_i$ as the set of objects in $Z$ with a certain property
$i$, we can say that $f(S)$ counts objects that have exactly the properties in $S$, whereas
$g(S)$ counts the objects that have at least the properties in $S$. It follows from this that
$g(T) = \sum_{S \supseteq T} f(S)$ for all $T$, so Theorem 4.52 tells us that $f(T) = \sum_{S \supseteq T} (-1)^{|S - T|} g(S)$
for all $T$. Now, $f(\emptyset) = |Z - (Z_1 \cup \cdots \cup Z_n)|$ and $g(\{i_1, \ldots, i_k\}) = |Z_{i_1} \cap \cdots \cap Z_{i_k}|$. The
Union-Avoiding Rule 4.5 follows from these observations.
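
The passage from the "at least" counts $g$ to the "exactly" counts $f$ can be carried out explicitly on a small example. In the Python sketch below, the names and the sample data are our own assumptions: $Z = \{1, \ldots, 30\}$, and properties 1, 2, 3 mean divisibility by 2, 3, 5 respectively.

```python
from itertools import combinations

def all_subsets(n):
    # every subset of {1, ..., n}, as a frozenset
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(1, n + 1), k)]

def at_least_counts(universe, props):
    # g(S): number of objects z lying in Z_i for every i in S; props[i] is the set Z_i
    n = len(props)
    return {S: sum(all(z in props[i] for i in S) for z in universe)
            for S in all_subsets(n)}

def exactly_counts(g, n):
    # f(T) = sum over S containing T of (-1)^{|S - T|} g(S)
    subsets = all_subsets(n)
    return {T: sum((-1) ** len(S - T) * g[S] for S in subsets if S >= T)
            for T in subsets}

universe = range(1, 31)
props = {1: {z for z in universe if z % 2 == 0},
         2: {z for z in universe if z % 3 == 0},
         3: {z for z in universe if z % 5 == 0}}
g = at_least_counts(universe, props)
f = exactly_counts(g, 3)
print(f[frozenset()])  # 8 objects have none of the three properties
```

The value `f[frozenset()]` is the union-avoiding count; here it equals the number of integers from 1 to 30 that are relatively prime to 30.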




4.12 Product Posets 


This section introduces a product construction for posets that leads to alternative derivations
of the Möbius functions for Boolean posets and divisibility posets.


4.56. Definition: Product Posets. Let $(X_1, \le_1), \ldots, (X_n, \le_n)$ be posets. Recall that
the Cartesian product $X = X_1 \times \cdots \times X_n$ consists of all $n$-tuples $x = (x_1, \ldots, x_n)$ with
$x_i \in X_i$ for $1 \le i \le n$. For $x = (x_i)$ and $y = (y_i)$ in $X$, define $x \le y$ iff $x_i \le_i y_i$ for $1 \le i \le n$.
The poset $(X, \le)$ is called the product of the posets $(X_i, \le_i)$.

One immediately verifies that the relation $\le$ in the preceding definition really is a partial
ordering on $X$.

4.57. Example. Let $X_1 = X_2 = \{1, 2\}$ with the ordering $1 < 2$. Both $X_1$ and $X_2$ are
totally ordered posets, but $X = X_1 \times X_2$ is not totally ordered. For example, $(1, 2)$ and
$(2, 1)$ are two incomparable elements of $X$.


4.58. Theorem: Möbius Function for a Product Poset. Let $(X, \le)$ be the product
of posets $(X_i, \le_i)$ for $1 \le i \le k$. Given $x = (x_i)$ and $y = (y_i)$ in $X$, we have

\[ \mu_{(X,\le)}(x, y) = \prod_{i=1}^{k} \mu_{(X_i,\le_i)}(x_i, y_i). \]

Proof. For brevity, write $\mu = \mu_{(X,\le)}$ and $\mu_i = \mu_{(X_i,\le_i)}$. By induction, we can reduce to the
case $k = 2$. We have the matrices

\[ Z_1 = [\chi(u_1 \le_1 v_1) : u_1, v_1 \in X_1], \qquad M_1 = [\mu_1(u_1, v_1) : u_1, v_1 \in X_1], \]
\[ Z_2 = [\chi(u_2 \le_2 v_2) : u_2, v_2 \in X_2], \qquad M_2 = [\mu_2(u_2, v_2) : u_2, v_2 \in X_2], \]
\[ Z = [\chi(u \le v) : u, v \in X], \qquad M = [\mu(u, v) : u, v \in X], \]

which satisfy $Z_1 M_1 = I$, $Z_2 M_2 = I$, and $ZM = I$. Define a matrix $M'$, with rows and
columns indexed by elements of $X$, such that for $u = (u_1, u_2)$ and $v = (v_1, v_2)$ in $X$,
the $u,v$-entry of $M'$ is $\mu_1(u_1, v_1)\mu_2(u_2, v_2)$. Note that the $u,v$-entry of $Z$ is
$\chi((u_1, u_2) \le (v_1, v_2)) = \chi(u_1 \le_1 v_1)\chi(u_2 \le_2 v_2)$. The following computation verifies that $ZM' = I$, and
hence $M' = M$ since the inverse of $Z$ is unique. For all $u = (u_1, u_2)$ and $w = (w_1, w_2)$ in $X$,

\begin{align*}
(ZM')(u, w) &= \sum_{v \in X} Z(u, v) M'(v, w) \\
&= \sum_{v_1 \in X_1} \sum_{v_2 \in X_2} \chi(u_1 \le_1 v_1)\chi(u_2 \le_2 v_2)\mu_1(v_1, w_1)\mu_2(v_2, w_2) \\
&= \Bigl( \sum_{v_1 \in X_1} \chi(u_1 \le_1 v_1)\mu_1(v_1, w_1) \Bigr) \cdot \Bigl( \sum_{v_2 \in X_2} \chi(u_2 \le_2 v_2)\mu_2(v_2, w_2) \Bigr) \\
&= (Z_1 M_1)(u_1, w_1) \cdot (Z_2 M_2)(u_2, w_2) \\
&= \chi(u_1 = w_1)\chi(u_2 = w_2) = \chi(u = w). \qquad \square
\end{align*}

4.59. Definition: Poset Isomorphisms. Given posets $(X, \le)$ and $(X', \le')$, a poset
isomorphism is a bijection $f : X \to X'$ such that for all $u, v \in X$, $u \le v$ iff $f(u) \le' f(v)$.

4.60. Theorem. If $f : X \to X'$ is a poset isomorphism between $(X, \le)$ and $(X', \le')$, then
for all $u, v \in X$,

\[ \mu_{(X',\le')}(f(u), f(v)) = \mu_{(X,\le)}(u, v). \]

Proof. This follows from Theorem 4.51. The chains of a given length from $u$ to $v$ in $(X, \le)$
correspond bijectively to the chains of that length from $f(u)$ to $f(v)$ in $(X', \le')$; the bijection
applies $f$ to each element in the chain. □


4.61. Example: Möbius Function of a Boolean Poset. Consider once more the
Boolean poset $X = (\mathcal{P}([n]), \subseteq)$. For $1 \le i \le n$, take $Y_i = \{0, 1\}$ with the ordering $0 < 1$,
and let $Y = Y_1 \times \cdots \times Y_n$ be the product poset. There is a bijection $f$ from $\mathcal{P}([n])$ to $\{0, 1\}^n$
that sends a subset $S$ to the word $f(S) = w = w_1 w_2 \cdots w_n$ with $w_i = 1$ for $i \in S$ and $w_i = 0$
for $i \notin S$. One readily sees that $f$ is a poset isomorphism, so $\mu_X(S, T) = \mu_Y(f(S), f(T))$.
Writing $f(T) = z = z_1 z_2 \cdots z_n$, Theorem 4.58 shows that $\mu_Y(w, z) = \prod_{i=1}^{n} \mu_{Y_i}(w_i, z_i)$. As
in Example 4.47, we see that

\[ \mu_{Y_i}(0, 0) = \mu_{Y_i}(1, 1) = 1; \qquad \mu_{Y_i}(0, 1) = -1; \qquad \mu_{Y_i}(1, 0) = 0. \]

So $w \not\le z$ implies $\mu_Y(w, z) = 0$. If $w \le z$ and $z$ has $k$ more 1's than $w$ does, we see that
$\mu_Y(w, z) = (-1)^k$. Translating back to subsets via $f^{-1}$, this says that $\mu_X(S, T) = 0$ when
$S \not\subseteq T$, and $\mu_X(S, T) = (-1)^{|T - S|}$ when $S \subseteq T$.


4.62. Example: Möbius Function of a Divisibility Poset. Let $n$ be a fixed positive
integer with prime factorization $n = p_1^{n_1} \cdots p_k^{n_k}$, and consider the divisibility poset $(X, \mid)$,
where $X = \{d \in \mathbb{Z}_{>0} : d \mid n\}$. For $1 \le i \le k$, let $Y_i = \{0, 1, \ldots, n_i\}$ with the ordering
$0 < 1 < \cdots < n_i$, and take $Y$ to be the product poset $Y_1 \times \cdots \times Y_k$. Any $d \in X$ has
prime factorization $d = p_1^{d_1} \cdots p_k^{d_k}$ where $0 \le d_i \le n_i$ for $1 \le i \le k$. The map sending $d$ to
$(d_1, \ldots, d_k)$ is readily seen to be a poset isomorphism from $X$ to $Y$. So

\[ \mu_X(d, e) = \mu_Y((d_1, \ldots, d_k), (e_1, \ldots, e_k)) = \prod_{i=1}^{k} \mu_{Y_i}(d_i, e_i). \]

As in Example 4.47, we see that $\mu_{Y_i}(d_i, e_i) = \chi(e_i = d_i) - \chi(e_i = d_i + 1)$. It follows that
$\mu_X(d, e) = 0$ unless $e$ is obtained from $d$ by multiplying by a set of $s$ distinct prime factors
chosen from $\{p_1, \ldots, p_k\}$, in which case $\mu_X(d, e) = (-1)^s$. It is now routine to check that
whenever $d \mid e$, $\mu_X(d, e) = \mu(e/d)$, where $\mu$ is the number-theoretic Möbius function.
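
The final "routine check" can also be delegated to a computer for a specific $n$. The Python sketch below is an illustration with our own function names and the sample value $n = 360 = 2^3 \cdot 3^2 \cdot 5$; it evaluates the product formula on exponent vectors and compares it with $\mu(e/d)$.

```python
def exponent_vector(d, primes):
    # exponents (d_1, ..., d_k) of the given primes in d
    vec = []
    for p in primes:
        e = 0
        while d % p == 0:
            d //= p
            e += 1
        vec.append(e)
    return vec

def mu_chain(a, b):
    # Möbius function of a totally ordered poset (Example 4.47)
    return 1 if b == a else -1 if b == a + 1 else 0

def mu_divisor_poset(d, e, primes):
    # product of the chain Möbius functions over the coordinates
    result = 1
    for a, b in zip(exponent_vector(d, primes), exponent_vector(e, primes)):
        result *= mu_chain(a, b)
    return result

def mobius(m):
    # number-theoretic Möbius function by trial division
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

n, primes = 360, (2, 3, 5)
X = [d for d in range(1, n + 1) if n % d == 0]
agree = all(mu_divisor_poset(d, e, primes) == (mobius(e // d) if e % d == 0 else 0)
            for d in X for e in X)
print(agree)  # True
```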


Summary 


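1. Inclusion-Exclusion Formulas. Let $S_1, \ldots, S_n$ be subsets of a finite set $X$.

• General Union Rule:

\[ |S_1 \cup S_2 \cup \cdots \cup S_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \]

• Union-Avoiding Rule:

\[ |X - (S_1 \cup \cdots \cup S_n)| = |X| + \sum_{k=1}^{n} (-1)^{k} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \]

• Number of Objects in Exactly $j$ of the Sets $S_i$:

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k}{j} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \]

• Number of Objects in at Least $j$ of the Sets $S_i$:

\[ \sum_{k=j}^{n} (-1)^{k-j} \binom{k-1}{j-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}|. \]

• Simplified Versions: Often $|S_{i_1} \cap \cdots \cap S_{i_k}|$ is some value $N(k)$ depending only
on $k$, not on $i_1, \ldots, i_k$. In this case, each sum over $i_1, \ldots, i_k$ in the preceding
formulas can be replaced by $\binom{n}{k} N(k)$.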


Applications of Inclusion-Exclusion.

• Surjections and Stirling Numbers. For $m \ge n \ge 1$, the number of surjections
from an $m$-element set onto an $n$-element set is $\sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m$. A summation
formula for the Stirling number of the second kind is

\[ S(m, n) = \sum_{k=0}^{n} (-1)^k \frac{(n-k)^m}{k!\,(n-k)!}. \]

• Euler's $\phi$ Function. For $m \ge 1$, $\phi(m)$ is the number of integers $x$ with $1 \le x \le m$
and $\gcd(x, m) = 1$. We have $\phi(m) = m \prod_{p \mid m} (1 - p^{-1})$, where the product ranges
over all prime divisors $p$ of $m$. For $m = p^e$ with $p$ prime, $\phi(p^e) = p^e - p^{e-1}$. If
$\gcd(m, n) = 1$, then $\phi(mn) = \phi(m)\phi(n)$. For $m \ge 1$, $\sum_{d \mid m} \phi(d) = m$.

• Derangements. A derangement of $S$ is a bijection $f : S \to S$ with $f(x) \neq x$ for
all $x \in S$. Let $d_n$ be the number of derangements of an $n$-element set. Then $d_n =
n! \sum_{k=0}^{n} (-1)^k / k!$, which is the closest integer to $n!/e$. Moreover, the numbers $d_n$
satisfy the recursions

\[ d_n = (n-1) d_{n-1} + (n-1) d_{n-2} \quad \text{for all } n \ge 2; \]
\[ d_n = n d_{n-1} + (-1)^n \quad \text{for all } n \ge 1. \]


Truth Function $\chi$. For any statement $P$, $\chi(P) = 1$ if $P$ is true, and $\chi(P) = 0$
if $P$ is false.

Involutions. An involution is a function $I : X \to X$ with $I \circ I = \mathrm{id}_X$. The
fixed point set of $I$ is $\mathrm{Fix}(I) = \{x \in X : I(x) = x\}$. When $X$ consists of signed
objects, $I$ is sign-reversing iff $\operatorname{sgn}(I(x)) = -\operatorname{sgn}(x)$ for all $x \in X - \mathrm{Fix}(I)$. For a
sign-reversing involution $I$ with domain $X$,

\[ \sum_{x \in X} \operatorname{sgn}(x) = \sum_{x \in \mathrm{Fix}(I)} \operatorname{sgn}(x). \]

Involutions provide combinatorial proofs of identities that involve signed terms.

Möbius Functions and Posets.

• Number-theoretic Möbius Function. Define $\mu : \mathbb{Z}_{>0} \to \{-1, 0, 1\}$ by $\mu(n) =
(-1)^s$ if $n$ is the product of $s \ge 0$ distinct primes, and $\mu(n) = 0$ otherwise. Then
$\sum_{d \mid m} \mu(d) = \chi(m = 1)$. Given functions $f$ and $g$ such that $f(m) = \sum_{d \mid m} g(d)$ for
all $m \ge 1$, the number-theoretic Möbius Inversion Formula states that

\[ g(m) = \sum_{d \mid m} f(m/d)\mu(d) = \sum_{d \mid m} f(d)\mu(m/d) \quad \text{for all } m \ge 1. \]

It follows that $\phi(m) = \sum_{d \mid m} \mu(d)\, m/d$.

• Posets. A partial ordering of $X$ is a relation $\le$ on $X$ that is reflexive, antisymmetric,
and transitive; the pair $(X, \le)$ is called a poset. A strict ordering of
$X$ is a relation $<$ on $X$ that is irreflexive and transitive. There is a bijection
mapping partial orders on $X$ to strict orders on $X$ defined by removing the diagonal
$\{(x, x) : x \in X\}$. A chain of length $k$ in a poset $(X, \le)$ is a sequence
$z_0 < z_1 < \cdots < z_k$ of elements of $X$; such a chain has sign $(-1)^k$.

• Möbius Functions for Posets. Given a poset $(X, \le)$ with $X = \{x_1, \ldots, x_n\}$,
define $n \times n$ matrices $Z$, $N$, and $M$ by $Z_{ij} = \chi(x_i \le x_j)$, $N_{ij} = \chi(x_i < x_j)$, and
$M_{ij}$ = the signed sum of all chains in the poset from $x_i$ to $x_j$. Then $Z = I + N$;
$N$ is nilpotent; and $M$ is the matrix inverse of $Z$. We write $\mu(x_i, x_j) = M_{ij}$ and
call $\mu$ the Möbius function of the poset $(X, \le)$. Suppose $f$ and $g$ are functions
with domain $X$. The Möbius Inversion Formula for Posets states that

\[ g(x) = \sum_{y \le x} f(y) \text{ for all } x \in X \quad \text{iff} \quad f(x) = \sum_{y \le x} g(y)\mu(y, x) \text{ for all } x \in X. \]

• Product Posets. Given posets $(X_i, \le_i)$ for $1 \le i \le n$, the product set $X =
X_1 \times \cdots \times X_n$ becomes a poset by defining $(x_1, \ldots, x_n) \le (y_1, \ldots, y_n)$ iff $x_i \le_i y_i$
for all $i$ between 1 and $n$. The Möbius function for the product poset satisfies

\[ \mu_X((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \prod_{i=1}^{n} \mu_{X_i}(x_i, y_i). \]

• Examples of Möbius Functions. The poset $X = \{1, 2, \ldots, n\}$ with the total
ordering $1 < 2 < \cdots < n$ has Möbius function

\[ \mu_X(i, i) = 1, \qquad \mu_X(i, i+1) = -1, \qquad \mu_X(i, j) = 0 \text{ for } j \neq i, i+1. \]

The Boolean poset $(\mathcal{P}([n]), \subseteq)$ of subsets of $\{1, 2, \ldots, n\}$ ordered by inclusion has
Möbius function

\[ \mu(S, T) = (-1)^{|T - S|} \chi(S \subseteq T) \quad \text{for } S, T \subseteq [n]. \]

If $n$ has prime factorization $p_1^{n_1} \cdots p_k^{n_k}$, then the poset of positive divisors of $n$
under the divisibility ordering has Möbius function

\[ \mu(d, e) = \begin{cases} (-1)^s & \text{if } e/d \text{ is a product of } s \text{ distinct primes;} \\ 0 & \text{otherwise.} \end{cases} \]

These results follow since the Boolean poset is isomorphic to the product of $n$
copies of the totally ordered set $\{0, 1\}$, whereas the divisibility poset is isomorphic
to the product poset $\{0, 1, \ldots, n_1\} \times \cdots \times \{0, 1, \ldots, n_k\}$.


Exercises 


4-1. Given that $|S| = 15$, $|T| = 13$, $|U| = 12$, $|S \cap T| = 6$, $|S \cap U| = 3$, $|T \cap U| = 4$, and
$|S \cap T \cap U| = 1$, find: (a) $|S \cup T|$; (b) $|S \cup T \cup U|$; (c) the number of objects in exactly one
of the sets $S$, $T$, $U$.

4-2. Given $S, T, U \subseteq X$ with $|X| = 35$, $|S| = 12$, $|T| = 14$, $|U| = 15$, $|S \cap T| = 5 = |S \cap U|$,
$|T \cap U| = 6$, and $|(S \cup T) \cap U| = 9$, find: (a) $|S \cap T \cap U|$; (b) $|X - (S \cup T \cup U)|$; (c) the
number of objects in exactly two of the sets $S$, $T$, $U$.

4-3. List all the derangements of {1, 2,3, 4}. 

4-4. Compute $d_{10}$ in four ways: (a) by rounding $10!/e$ to the nearest integer; (b) by using
the summation formula in 4.16; (c) by using the recursion in 4.17; (d) by using the recursion
in 4.18.

4-5. Compute $\phi(n)$, $\mu(n)$, $\tau(n)$, and $\sigma(n)$ for the following choices of $n$: (a) 6; (b) 11; (c) 28;
(d) 60; (e) 1001; (f) 121.

4-6. Verify Theorem 4.37 by direct calculation for (a) m = 24; (b) m = 30. 

4-7. Given n married couples, how many ways can the n men and n women be paired so 
that no pair consists of a man and his wife? 

4-8. How many five-card poker hands have at least one card of every suit? 

4-9. How many five-card poker hands have at least one face card, at least one diamond, 
and do not contain both a 2 and a 3? 

4-10. How many ten-digit numbers contain at least one 4, one 5, and one 7? 

4-11. How many bridge hands are void in clubs and have at least one card of value p for 
each prime p < 10? 

4-12. A keyboard has 26 lowercase letters, 26 uppercase letters, 10 digits, and 32 special
characters. How many n-letter passwords contain at least one character from each category?

4-13. How many ways can we place m labeled balls into n labeled boxes such that at least
one box is empty?


4-14. How many ways can we place m identical balls into n labeled boxes such that at 
least one box is empty? Solve this problem in two ways, and thereby deduce a binomial 
coefficient identity. 

4-15. (a) How many $n$-letter words using the alphabet $\{a_1, \ldots, a_k\}$ (where $k \ge 3$) contain
at least one copy of $a_1$, $a_2$, and $a_3$? (b) Repeat (a) assuming all letters in the word must be
distinct.

4-16. How many surjections $f : \{1, 2, \ldots, m\} \to \{1, 2, \ldots, n\}$ have the property that $f(x) =
1$ for exactly one $x$?

4-17. How many ways can we put thirty identical balls into eight labeled boxes if the first 
four boxes must each contain at most five balls and the last four boxes must each contain 
at least two balls? 

4-18. For even n > 2, determine the number of integers « < n with gced(a,n) = 2. 

4-19. For $k \ge 0$ and $m \ge 1$, let $\sigma_k(m) = \sum_{d \mid m} d^k$. (a) Find a formula for $\sigma_k(m)$ in terms 
of the prime factorization of $m$. (b) Find a formula for $m^k$ involving $\sigma_k$ and $\mu$. 

4-20. Use Theorem 4.13 to show that $\phi(mn) = \phi(m)\phi(n)$ iff $\gcd(m,n) = 1$. 


4-21. Explicitly compute how the first involution discussed in Example 4.25 matches up 
the 24 objects counted by $\sum_{k} s(4,k)$ into pairs of objects of opposite sign. 


4-22. Suppose $w$ has cycles (1), (2), (3,8,7), (5,6,9), (4), and 

\[ U = \{\{(1)\},\ \{(2)\},\ \{(4), (5,6,9)\},\ \{(3,8,7)\}\}. \]

Compute $I(w,U)$, where $I$ is the involution defined at the end of Example 4.25. 

4-23. Consider the derangement $w = 436215 \in D_6$. Find the six derangements in $D_7$ and 
the seven derangements in $D_8$ that can be built from $w$ by the construction in the proof of 
Theorem 4.17. 

4-24. Use the recursion for derangements in 4.18 to give a proof by induction of the sum- 
mation formula for derangements in 4.16. 

4-25. Give the details of the proof of Theorem 4.42. 

4-26. (a) Give an algebraic proof that $\sum_{k=0}^{n} \binom{n}{k} 2^k (-1)^{n-k} = 1$ for $n \ge 0$. (b) Prove the 
identity in (a) using an involution. 

4-27. For integers $a > b > 0$, evaluate $\sum_{k=0}^{n} \binom{n}{k} a^{n-k} (-b)^k$ by using an involution. 

4-28. Let $S \subseteq T$ be given finite sets. (a) Use an involution to prove 
$\sum_{U:\, S \subseteq U \subseteq T} (-1)^{|T-U|} = \chi(S = T)$ (cf. Example 4.48). (b) In a similar manner, 
evaluate $\sum_{U:\, S \subseteq U \subseteq T} (-1)^{|U-S|}$. 


4-29. Given $d, e \in \mathbb{Z}_{>0}$ with $d \mid e$, use an involution to prove $\sum_{k:\, d \mid k \mid e} \mu(e/k) = \chi(d = e)$. 
Interpret this result in terms of the Möbius function of a poset. 

4-30. Count the $n \times n$ matrices $A$ with entries in $\{0,1,2\}$ such that: (a) no row of $A$ contains 
all zeroes; (b) every column of $A$ contains at least one zero; (c) there is no index $j$ with 
$A(i,j) > 0$ and $A(j,i) > 0$ for all $i$. 

4-31. An arrowless vertex in a simple digraph D is a vertex with indegree and outdegree 
zero. How many simple digraphs with vertex set {1,2,...,n} have no arrowless vertices? 
4-32. An isolated vertex in a simple digraph D is a vertex v such that there is no edge (u, v) 
or (v,u) in D with $u \ne v$. How many simple digraphs with vertex set {1,2,...,n} have no 
isolated vertices? 

4-33. How many simple graphs with vertex set {1,2,...,n} have no isolated vertices? 
4-34. (a) How many anagrams in $R(1^3 2^3 \cdots n^3)$ never have three equal letters in a row? 
(b) How many anagrams in $R(1^k 2^k \cdots n^k)$ never have $k$ equal letters in a row? 

4-35. (a) Count the permutations $w$ of $\{1,2,\ldots,n\}$ such that $w_{i+1} \ne w_i + 1$ for all $i$ in the 
range $1 \le i < n$. (b) Express your answer to (a) in terms of the derangement numbers $d_k$. 

4-36. Given sequences $0 < a_1 < a_2 < \cdots < a_k < A$ and $0 < b_1 < b_2 < \cdots < b_k < B$, use 
inclusion-exclusion to derive a formula for the number of lattice paths from (0,0) to $(A,B)$ 
that avoid all of the points $(a_i, b_i)$ for $1 \le i \le k$. 

4-37. Recursion for Möbius Functions. (a) Show that the Möbius function of a poset 
$(X, \le)$ can be computed recursively via $\mu(x,z) = -\sum_{y:\, x \le y < z} \mu(x,y)$ for $x < z$, with initial 
conditions $\mu(x,x) = 1$ and $\mu(x,z) = 0$ whenever $x \not\le z$. (b) Show that the Möbius function 
also satisfies the recursion $\mu(x,z) = -\sum_{y:\, x < y \le z} \mu(y,z)$ for $x < z$. 

4-38. Poset Associated to a DAG. Suppose G = (X, R) is a DAG. Prove that there 
exists a unique smallest irreflexive, transitive relation < that contains R. The corresponding 
poset (X,<) is called the poset associated to the DAG G. 

4-39. Let (X,<) be the poset associated to the DAG 


({a, b,c, d, e}, {(a, b), (b, e), (a, c), (c, e), (a, d), (d, e)}). 


Compute the Möbius function $\mu_X$ in two ways, by: (a) inverting the matrix $Z$; (b) enumerating 
signed chains in $(X, \le)$. 

4-40. Let (X,<) be the poset associated to the DAG 


\[ (\{a,b,c,d,e,f\},\ \{(a,b), (a,c), (b,d), (b,e), (c,d), (d,f), (e,f)\}). \]


Compute the Möbius function $\mu_X$ in two ways, by: (a) inverting the matrix $Z$; (b) enumerating 
signed chains in $(X, \le)$. 

4-41. A subposet of a poset $(X, \le)$ is a poset $(Y, \le')$, where $Y$ is a subset of $X$, and for 
$a, b \in Y$, $a \le' b$ iff $a \le b$. An interval in $X$ is a subposet of the form $[x,z] = \{y \in X : x \le y \le z\}$. 
Show that for all $a,b,c,d \in X$, if the intervals $[a,b]$ and $[c,d]$ are isomorphic posets, 
then $\mu_X(a,b) = \mu_X(c,d)$. 

4-42. Assume that $X_1$ and $X_2$ are finite disjoint sets. The disjoint union of the posets 
$(X_1, \le_1)$ and $(X_2, \le_2)$ is $(X, \le)$ where $X = X_1 \cup X_2$ and for $a,b \in X$, $a \le b$ iff $a,b \in X_1$ 
and $a \le_1 b$, or $a,b \in X_2$ and $a \le_2 b$. Determine $\mu_X$ in terms of $\mu_{X_1}$ and $\mu_{X_2}$. 

4-43. Given a poset $(X, \le)$, define a new poset $(Y, \le')$ by setting $Y = X \cup \{0\}$ (where 0 is a 
new symbol not in $X$), and letting $\le'$ be the extension of $\le$ such that $0 \le' y$ for all $y \in Y$. 
Informally, $(Y, \le')$ is obtained from $(X, \le)$ by adjoining a new least element. Determine $\mu_Y$ 
in terms of $\mu_X$. 

4-44. Given posets $(X_1, \le_1)$ and $(X_2, \le_2)$ where $X_1$ and $X_2$ are finite disjoint sets, define a 
new poset $(X, \le)$ by setting $X = X_1 \cup X_2$ and, for $a,b \in X$, $a \le b$ iff $a,b \in X_1$ and $a \le_1 b$, 
or $a,b \in X_2$ and $a \le_2 b$, or $a \in X_1$ and $b \in X_2$. Informally, $(X, \le)$ is obtained from $X_1$ and 
$X_2$ by making everything in $X_1$ less than everything in $X_2$. Determine $\mu_X$ in terms of $\mu_{X_1}$ 
and $\mu_{X_2}$. 

4-45. Prove Theorem 4.58 by counting signed chains in the product poset X. 

4-46. Given events $S_1, \ldots, S_n$ in a sample space $X$, find and prove an inclusion-exclusion 
formula for $P(S_1 \cup \cdots \cup S_n)$. 

4-47. Let $S_1, \ldots, S_n$ be independent events in a sample space $X$ (see Definition 1.66). Prove 
that for $1 \le i \le n$, the events $S_1, S_2, \ldots, X - S_i, \ldots, S_n$ are independent. 

4-48. Let $S_1, \ldots, S_n$ be independent events in a sample space $X$, with $P(S_i) = p_i$ for each 
$i$. Find the probability that none of the events $S_i$ occurs first by using inclusion-exclusion, 
and then by iterating the previous exercise. Show algebraically that the two answers agree. 


4-49. Use an involution to prove that for all i,n € Zso, ieo(—-1)*(") eae = x(t = 0). 


4-50. Use an involution to prove that for 0 < k <n, Y7_,(-1)*-*(”) (j)2"7* = (7). 
4-51. Prove that for all n,j > 0, ni = D4_9(—1)7-kIS(g, kb) (OTR). 

4-52. For $n \ge 0$, evaluate $\sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^n$. 

4-53. Use an involution to prove the following identity satisfied by Catalan numbers: 

Ch = irene Daa, 

4-54. Let A be an n x n matrix with A(i,j) = eae) for 1 < i,j <n. Find and prove a 
formula for A~!. 

4-55. How many bijections f : {1,2,...,n} > {1,2,...,n} are such that the functional 
digraph of f contains no cycle of length k? 

4-56. How many anagrams in R(a?b°c3d?) never have two consecutive equal letters? 
4-57. Prove or disprove: for every integer $y \ge 1$, there exist only finitely many integers 
$x \ge 1$ with $\phi(x) = y$. 

4-58. How many compositions of n have k parts each of size at most m? 


4-59. Call a function $f : X \to Y$ doubly surjective iff for all $y \in Y$, there exist at least two 
$x \in X$ with $f(x) = y$. Count the number of doubly surjective functions from an $m$-element 
set to an $n$-element set, where $m \ge 2n$. What is the answer when $m = 11$ and $n = 4$? 

4-60. How many integers between 1 and 2311 are divisible by exactly two of the primes in 
{2,3,5, 7}? 

4-61. Let $(F_n)$ be the Fibonacci sequence ($F_0 = 0$, $F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$ for $n \ge 2$). 
Find a formula for $\sum_{k=0}^{n} (-1)^k F_k$ and prove it algebraically or by using an involution. 

4-62. Find and prove a formula for $\sum_{k=0}^{n} (-1)^k F_k F_{n-k}$. 

4-63. Prove $F_{n-1}F_{n+1} - F_n^2 = (-1)^n$ using an involution on domino tilings. 

4-64, Prove: fori<n<i+m, FinFn — FinsiFo—i = (-1)" Pink. 

4-65. For each integer $x \ge 1$, evaluate $\sum_{k=1}^{x} \mu(k) \lfloor x/k \rfloor$. 

4-66. Let $\mathrm{Surj}(n,k)$ be the number of surjections from an $n$-element set onto a $k$-element 
set. For $n > 0$, evaluate $\sum_{k=1}^{n} (-1)^k\, \mathrm{Surj}(n,k)$. 

4-67. For $n > 0$, evaluate $\sum_{k=1}^{n} (-1)^k (k-1)!\, S(n,k)$. 


4-68. Given elements $a_k, b_k$ in $\mathbb{R}$ (or in any commutative ring), prove: 

\[ \prod_{k=1}^{n} (a_k + b_k) = \sum_{I \subseteq \{1,2,\ldots,n\}}\ \prod_{i \in I} a_i \prod_{j \notin I} b_j. \]

4-69. Counting Proof of Inclusion-Exclusion. Let $S_1, \ldots, S_n$ be given subsets of a 
finite set $X$. Let $A$ be a matrix with rows indexed by elements $x \in X$ and columns indexed 
by nonempty subsets $I$ of $[n] = \{1,2,\ldots,n\}$. Define the entry of $A$ in row $x$ and column 
$I$ to be $(-1)^{|I|-1} \chi(x \in \bigcap_{i \in I} S_i)$. Prove the Inclusion-Exclusion Formula for $|\bigcup_{i=1}^{n} S_i|$ by 
computing the sum of all entries of $A$ in two ways. Discuss how this proof keeps track of the 
number of times each $x \in X$ is included and excluded in the Inclusion-Exclusion Formula. 

4-70. Let $I_n$ be the number of involutions on an $n$-element set. (a) Find $I_n$ for $1 \le n \le 5$. 
(b) Find a recursion for computing $I_n$ from $I_{n-1}$ and $I_{n-2}$. (c) Find a summation formula 
for $I_n$. 

4-71. How many 10-letter words using the alphabet {A,B,...,Z} contain exactly three of 
the five vowels A, E, I, O, U? (Vowels that are used may appear more than once.) 
4-72. How many 13-card bridge hands contain exactly 10 of the 13 possible card values (2 
through ace)? 
4-73. How many numbers between 1 and 1,000,000 are divisible by at least three of the 
primes in the set {2,3,5,7,11,13}? 
4-74. How many words in $R(1^2 2^2 \cdots n^2)$ have at least $k$ pairs of adjacent letters that are 
equal? 
4-75. How many functions from an m-element set to an n-element set have image of size 
k? 
4-76. Prove the generalized inclusion-exclusion formula (4.7) using Lemma 4.27(b). 

k k i(k 
4-77. Prove: for j < k, ¥)9_;(—1)?(;) = (-1)7( HE 


P gal 
4-78. Prove (4.7) by summing appropriate instances of the formula (4.5). 
4-79. Generalize the involution-based proof given in 4.26 to prove (4.5). 
4-80. Combinatorial Interpretation for Coefficients of Chromatic Polynomials. 
Given a simple graph $G$ with vertex set $V$ and edge set $E$, a vertex-spanning subgraph of 
$G$ is a graph $H$ with vertex set $V$ and edge set $E' \subseteq E$. Let $n(G,e,c)$ be the number 
of vertex-spanning subgraphs of $G$ with $e$ edges and $c$ connected components. Prove the 
following formula for the chromatic polynomial of $G$: 

\[ \chi_G(x) = \sum_{e,c \ge 0} (-1)^e\, n(G,e,c)\, x^c. \]


4-81. Use the previous exercise to compute the chromatic polynomial of the graph with 
vertex set {1,2,3,4} and edge set {{1,2}, {1,3}, {1,4}, {2,3}}. 

4-82. (a) What is the chromatic polynomial for the 4-cycle $C_4$? (b) For each coefficient 
of this chromatic polynomial, draw the vertex-spanning subgraphs of $C_4$ counted by that 
coefficient. 

4-83. Show that if $G$ is a simple graph with $c$ connected components, then the chromatic 
polynomial $\chi_G(x)$ must be divisible by $x^c$. 

4-84. For a poset $(X, \le)$ with $X = \{x_1, \ldots, x_n\}$, define $n \times n$ matrices $Z$ and $M$ by setting 
$Z_{ij} = \chi(x_i \le x_j)$ and letting $M_{ij}$ be the sum of the signs of all chains in $X$ from $x_i$ to $x_j$. 
Use an involution to prove the matrix identity $ZM = I$. 

4-85. Use an involution to prove the Surjection Rule 4.10. 

4-86. Use an involution to prove the Summation Formula for Derangements 4.16. 

4-87. Consider an $n \times n$ lower-triangular matrix $A$ such that $A(n,k)$ is the number of Dyck 
paths ending with exactly $k$ east steps, for $1 \le k \le n$. Find a combinatorial description of 
$A^{-1}$, and prove that this is the inverse of $A$ using an involution. 

4-88. Garsia-Milne Involution Principle. Suppose $I$ and $J$ are involutions defined on 
finite signed sets $X$ and $Y$, respectively. Suppose $f : X \to Y$ is a sign-preserving bijection, 
i.e., $\mathrm{sgn}(f(x)) = \mathrm{sgn}(x)$ for all $x \in X$. Suppose also that every object in $\mathrm{Fix}(I)$ and $\mathrm{Fix}(J)$ 
has positive sign. Construct an explicit bijection $g : \mathrm{Fix}(I) \to \mathrm{Fix}(J)$. 

4-89. Bijective Subtraction. Suppose $A$, $B$, and $C$ are finite, pairwise disjoint sets and 
$f : A \cup B \to A \cup C$ is a given bijection. Construct an explicit bijection $g : B \to C$. 

4-90. Bijective Division by Two. Suppose $A$ and $B$ are finite sets. Given a bijection 
$f : \{0,1\} \times A \to \{0,1\} \times B$, can you use $f$ to construct an explicit bijection $g : A \to B$? 

4-91. In §4.6 we proved combinatorially that $\sum_k s(i,k)S(k,j) = \chi(i = j)$. Can you find a 
combinatorial proof that $\sum_k S(i,k)s(k,j) = \chi(i = j)$? (Compare to Theorem 2.64(d).) 

4-92. Find a bijective proof of the derangement recursion $d_n = n d_{n-1} + (-1)^n$. (For a 
solution, see [105].) 

4-93. Let $X_n$ be the set of set partitions of $\{1,2,\ldots,n\}$. Define the refinement ordering 
on $X_n$ by setting, for $P, Q \in X_n$, $P \preceq Q$ iff every block $S \in P$ is contained in some block 
$T \in Q$. (a) Show that $(X_n, \preceq)$ is a poset. (b) Compute the Möbius function of this poset for 
$1 \le n \le 4$. (c) Show that any interval $[P,Q] = \{R \in X_n : P \preceq R \preceq Q\}$ in $X_n$ is isomorphic 
to a poset $(X_k, \preceq)$ for some $k$. (d) Compute $\mu_{X_n}$ for all $n$. 

4-94. Prove: for $n \ge 1$, $\sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} F_k = F_n$, where $F_n$ denotes a Fibonacci number. 

4-95. Prove: for 0 <<m<n, sn 8 aes te = (-1)™Fi_m. 

4-96. Prove: for n > 1, 74", (-1)* (77) 2* 1, = 0. 

4-97. Prove: for n > 1, ope (—1)* (2) Foe = (-1)"F,. 

4-98. Suppose n > 2,0 <m <n, and n is even. Prove 7p_,(-1)*(,2",)k™ =0. 


nt+k 
4-99. Prove: for m,n > 0, )729(—1)*(™)(,,",,) = (-1)"(™). 


2n—k n 
4-100. Prove: for n > 0, yin? ye 1. 


4-101. Prove: for n > 0, ye) ye i") aoe, 


n 


yee) = 


4-102. Prove: for n > 0, We 9(—1)*( 
4-103. Prove: for n > 0, 372"5(—1)*(3") (@™-*)” = (3). 

Notes 


A thorough treatment of posets from the combinatorial viewpoint appears in Chapter 3 
of [121]. See [113] for one of the seminal papers on Möbius inversion in combinatorics. A 
classic text on posets is the book by Birkhoff [11]. The Garsia-Milne Involution Principle 
(Exercise 4-88) was introduced in [43, 44]. For applications and extensions of this principle, 
the reader may consult [50, 68, 82, 83, 103, 104, 133]. An application of Bijective Subtraction 
(Exercise 4-89) is presented in [80]. 


Generating Functions 


This chapter introduces generating functions, which are powerful tools for solving many 
combinatorial problems. Intuitively, a generating function is an infinite series $\sum_{n=0}^{\infty} a_n z^n$ 
whose coefficients $a_n$ count a family of combinatorial objects. We can obtain a great deal 
of information about the numbers $a_n$ by performing algebraic or analytic operations on 
the associated generating functions. For example, generating functions can be used to find 
closed formulas for recursively defined sequences $a_n$. They also give an automatic method 
for evaluating a summation $\sum_{k=0}^{n} a_k$. Generating functions often deliver quick answers to 
otherwise intractable counting questions. 

This chapter focuses on manipulating generating functions to obtain explicit solutions to 
concrete problems. Later, in Chapter 11, we look more closely at more theoretical aspects 
of generating functions. Some concepts used informally in this chapter will be justified 
rigorously at that time. 


5.1 What is a Generating Function? 


Generating functions can be defined in three different ways: combinatorially, analytically, or 
algebraically. According to the combinatorial viewpoint, a generating function is something 
that lets us count a set of weighted objects that might be infinite. More precisely, we define 
a weighted set to be a set $S$ together with a function $\mathrm{wt} : S \to \mathbb{Z}_{\ge 0}$. For each object $u \in S$, 
$\mathrm{wt}(u)$ is a nonnegative integer called the weight of $u$. As part of the definition, we require 
that for each $n \ge 0$, the set $A_n = \{u \in S : \mathrm{wt}(u) = n\}$ is finite. The generating function of 
the weighted set $S$ is defined to be the formal sum 

\[ \mathrm{GF}(S; z) = \sum_{u \in S} z^{\mathrm{wt}(u)}, \]

where $z$ is a formal variable used to keep track of the weights of the objects in $S$. In the 
sum appearing here, a given power $z^n$ will appear once for each object in $S$ having weight 
$n$. Collecting together all of these powers, the generating function for $S$ can be rewritten 

\[ \mathrm{GF}(S; z) = \sum_{n=0}^{\infty} a_n z^n, \quad \text{where } a_n = |A_n| = |\{u \in S : \mathrm{wt}(u) = n\}|. \tag{5.1} \]

Momentarily, we will explain precisely what is meant by this sum when S is infinite. 
5.1. Example. Let $S$ be the set of all finite sequences of 0's and 1's. For a sequence $u \in S$, 
let $\mathrm{wt}(u)$ be the length of $u$. For example, $\mathrm{wt}(10110) = 5$. In this case, the set $A_n$ of objects 
in $S$ having weight $n$ is $\{0,1\}^n$. By the Word Rule, $|A_n| = 2^n$. Therefore, the generating 
function for the weighted set $S$ is 

\[ \mathrm{GF}(S; z) = \sum_{n=0}^{\infty} 2^n z^n. \]
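
The combinatorial definition can be checked by brute force on a truncated version of this example. The following Python sketch is only an illustration: the helper names `gf_coefficients` and `binary_words` are hypothetical, and the enumeration stops at an arbitrarily chosen maximum weight of 6. It tallies the objects of each weight and recovers the coefficients $a_n = 2^n$.

```python
from itertools import product

def gf_coefficients(objects, wt, max_weight):
    """Return [a_0, ..., a_max_weight], where a_n counts the objects of weight n."""
    coeffs = [0] * (max_weight + 1)
    for u in objects:
        n = wt(u)
        if n <= max_weight:
            coeffs[n] += 1
    return coeffs

def binary_words(max_len):
    """Yield every word over {0, 1} of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)

# Weight = length of the word, as in the example above.
coeffs = gf_coefficients(binary_words(6), len, 6)
print(coeffs)
```

Each entry of `coeffs` is the coefficient of the corresponding power of $z$ in the truncated generating function.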

Next we discuss the analytic definition of generating functions. According to this viewpoint, 
a generating function is an actual function $G : D \to \mathbb{C}$ given by a power series 
$G(z) = \sum_{n=0}^{\infty} a_n z^n$, where $a_0, a_1, \ldots, a_n, \ldots$ is a fixed sequence of complex numbers, $D$ is 
a neighborhood of 0 in $\mathbb{C}$, and $z$ is a complex variable. A key technical point is that the 
infinite series of complex numbers defining $G(z)$ is required to converge for each $z$ in the 
domain $D$. This means that for each $z \in D$, there exists a complex number $G(z)$ with 

\[ G(z) = \lim_{N \to \infty} \sum_{n=0}^{N} a_n z^n. \]

Each limit must be a complex number, not $\pm\infty$. By the theory of power series (reviewed 
in more detail below), for each sequence $(a_n)$ there exists $R \in [0, \infty]$, called the radius of 
convergence of the power series, such that $\sum_{n=0}^{\infty} a_n z^n$ converges for all $z$ with $|z| < R$, and 
$\sum_{n=0}^{\infty} a_n z^n$ diverges for every $z$ with $|z| > R$. When $R > 0$, all the coefficients $a_n$ are 
uniquely determined by the associated function $G(z) = \sum_{n=0}^{\infty} a_n z^n$ via Taylor's formula 
$a_n = G^{(n)}(0)/n!$. When $R = 0$, the series $\sum_{n=0}^{\infty} a_n z^n$ converges only at $z = 0$, so the 
resulting function of $z$ has domain $\{0\}$. This domain is not a neighborhood of 0, so we do 
not get an analytic generating function in this case. The function's only value is $G(0) = a_0$, 
and we learn nothing from this function about the coefficients $a_n$ with $n > 0$. 


5.2. Example. Our previous example led us to the generating function $G(z) = \sum_{n=0}^{\infty} 2^n z^n$. 
Viewing $G$ analytically as a function of the complex variable $z$, we can use the Geometric 
Series Formula 2.2 to write 

\[ G(z) = \sum_{n=0}^{\infty} (2z)^n = \frac{1}{1 - 2z}. \]

The series converges to the indicated sum for all $z \in \mathbb{C}$ satisfying $|2z| < 1$, or $|z| < 1/2$. 
The radius of convergence of this power series is 1/2. 
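
A quick numerical experiment makes the role of the radius of convergence concrete. In this Python sketch the sample points `z_inside` and `z_outside` and the helper names are arbitrary choices made for illustration. Inside the disk $|z| < 1/2$ the partial sums approach $1/(1-2z)$. Outside it, the individual terms $(2z)^n$ grow without bound, so the series cannot converge.

```python
def partial_sum(z, terms):
    """Sum of 2^n z^n for n = 0, ..., terms - 1."""
    return sum((2 * z) ** n for n in range(terms))

def closed_form(z):
    """The value 1 / (1 - 2z) given by the Geometric Series Formula."""
    return 1 / (1 - 2 * z)

z_inside = 0.2 + 0.1j    # |z| is about 0.224, which is less than 1/2
gap = abs(partial_sum(z_inside, 200) - closed_form(z_inside))

z_outside = 0.6          # |z| exceeds 1/2
term_sizes = [abs((2 * z_outside) ** n) for n in (10, 20, 40)]

print(gap)         # tiny: the partial sums have settled near the closed form
print(term_sizes)  # increasing: the terms do not tend to 0
```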


One powerful feature of the analytic view of generating functions is that it allows us 
to obtain combinatorial information about weighted sets by manipulating functions of z 
(using techniques such as geometric series, partial fraction decompositions, the Quadratic 
Formula, and the Extended Binomial Theorem discussed below). The next example shows 
how the Binomial Theorem can be used to simplify a generating function. 


5.3. Example. Fix a positive integer $m$, and let $T$ be the set of all subsets of $\{1,2,\ldots,m\}$. 
For an object $B \in T$, let $\mathrm{wt}(B) = |B|$ be the number of elements in the subset $B$. By 
the Subset Rule, we know there are $\binom{m}{n}$ objects in $T$ having weight $n$. So the generating 
function for $T$ is 

\[ H(z) = \mathrm{GF}(T; z) = \sum_{B \in T} z^{|B|} = \sum_{n=0}^{m} \binom{m}{n} z^n. \]

The function $H$ is given as a power series with only finitely many nonzero coefficients. In 
other words, $H$ is a polynomial in $z$. Thus, the power series has infinite radius of convergence, 
and $H : \mathbb{C} \to \mathbb{C}$ is a polynomial function defined on all of $\mathbb{C}$. This happens whenever we 
use generating functions to enumerate a finite weighted set $T$, since $\{u \in T : \mathrm{wt}(u) = n\}$ 
has size 0 for all $n$ exceeding the maximum weight of any object in $T$. For the particular 
polynomial $H$ considered here, the Binomial Theorem shows that the series defining $H$ can 
be simplified to $H(z) = (1+z)^m$. 
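
The identity $H(z) = (1+z)^m$ can be confirmed for small $m$ by comparing two coefficient lists. The Python sketch below is illustrative only, with hypothetical function names. One function counts subsets by size directly, and the other multiplies out $(1+z)^m$ one factor at a time.

```python
from itertools import combinations

def subset_gf(m):
    """Coefficients [a_0, ..., a_m], where a_n counts the n-element subsets of {1, ..., m}."""
    coeffs = [0] * (m + 1)
    for size in range(m + 1):
        for _ in combinations(range(1, m + 1), size):
            coeffs[size] += 1
    return coeffs

def expand_one_plus_z_power(m):
    """Coefficients of (1 + z)^m, obtained by multiplying by (1 + z) a total of m times."""
    poly = [1]
    for _ in range(m):
        padded = poly + [0]     # 1 * poly
        shifted = [0] + poly    # z * poly
        poly = [p + s for p, s in zip(padded, shifted)]
    return poly

print(subset_gf(5))                 # [1, 5, 10, 10, 5, 1]
print(expand_one_plus_z_power(5))   # [1, 5, 10, 10, 5, 1]
```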


5.4. Example. For each $n \ge 0$, let $S_n$ be the set of permutations of $\{1,2,\ldots,n\}$. Let 
$S = \bigcup_{n \ge 0} S_n$, and let the weight of a word $u \in S$ be the length of that word. By the 
Permutation Rule, we have $n! = |S_n| = |\{u \in S : \mathrm{wt}(u) = n\}|$, so the generating function 
for $S$ is 

\[ \mathrm{GF}(S; z) = \sum_{u \in S} z^{\mathrm{wt}(u)} = \sum_{n=0}^{\infty} n!\, z^n. \]

Unfortunately, this power series has radius of convergence zero (as can be checked using the 
Ratio Test, which we review below). So we cannot obtain combinatorial information about 
factorials using analytic power series. 


The last example reveals the main drawback of using analytically defined generating 
functions: when the combinatorial coefficients $a_n = |\{u \in S : \mathrm{wt}(u) = n\}|$ grow too quickly, 
the associated power series only converges at $z = 0$. This problem can be addressed in 
two ways. One way is to use a modified version of the combinatorial generating function, 
called an exponential generating function, defined by $\mathrm{EGF}(S; z) = \sum_{n=0}^{\infty} (a_n/n!) z^n$. In the 
previous example, we find that $\mathrm{EGF}(S; z) = \sum_{n=0}^{\infty} z^n = 1/(1-z)$, which converges for 
all $z$ with $|z| < 1$. Generating functions as defined in (5.1) are sometimes called ordinary 
generating functions (OGFs) to distinguish them from exponential generating functions 
(EGFs). We discuss EGFs later in this chapter after developing the theory of OGFs. 
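
The contrast between the OGF and the EGF of the permutation example shows up in the ratios $|a_n/a_{n+1}|$ of consecutive coefficients, whose limit gives the radius of convergence when it exists. The following Python sketch is an illustration with hypothetical helper names and an arbitrary cutoff of ten coefficients. It computes these ratios exactly for both series.

```python
from fractions import Fraction
from math import factorial

def ogf_coeffs(counts):
    """Coefficients a_n of the ordinary generating function."""
    return [Fraction(a) for a in counts]

def egf_coeffs(counts):
    """Coefficients a_n / n! of the exponential generating function."""
    return [Fraction(a, factorial(n)) for n, a in enumerate(counts)]

def consecutive_ratios(coeffs):
    """The ratios |a_n / a_(n+1)| for consecutive coefficients."""
    return [abs(a / b) for a, b in zip(coeffs, coeffs[1:])]

permutation_counts = [factorial(n) for n in range(10)]   # a_n = n!

ogf_ratios = consecutive_ratios(ogf_coeffs(permutation_counts))
egf_ratios = consecutive_ratios(egf_coeffs(permutation_counts))

print(ogf_ratios)  # 1, 1/2, 1/3, ...: shrinking toward 0
print(egf_ratios)  # constantly 1
```

The OGF ratios shrink toward 0, matching radius of convergence zero, while the EGF ratios stay at 1, matching the expansion $1/(1-z)$ for $|z| < 1$.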

However, even using EGFs, there are examples of weighted sets where the associated 
power series have radius of convergence equal to zero. The second way around this problem is 
to use the algebraic definition of generating functions. Algebraically, a generating function is 
nothing more than an infinite sequence $A = (a_n : n \in \mathbb{Z}_{\ge 0})$, where each $a_n$ is a complex 
number (or more generally, an element of some commutative ring $R$). The power series notation 
$A = A(z) = \sum_{n=0}^{\infty} a_n z^n$ is still used as a way to present the sequence $(a_n)$, and we still call 
$a_n$ "the coefficient of $z^n$ in $A(z)$." The crucial point here is that $z$ is no longer a complex 
variable, and $\sum_{n=0}^{\infty} a_n z^n$ no longer represents an infinite series of complex numbers. The symbol 
$z$ is nothing more than a formal variable, a placeholder that lets us write symbolic 
expressions displaying all the coefficients $a_n$. To emphasize this viewpoint, we call $A(z)$ a formal 
power series in $z$ with coefficients in $R$. When working with a formal power series, the issue 
of the convergence of the series at particular values of $z$ does not arise, since the series is not 
even a function of $z$. Returning to some previous examples, $\sum_{n=0}^{\infty} 2^n z^n$ is just notation for 
the sequence $(1, 2, 4, 8, \ldots, 2^n, \ldots)$, $\sum_{n=0}^{m} \binom{m}{n} z^n$ is notation for $(1, m, \binom{m}{2}, \ldots, 1, 0, 0, \ldots)$, 
and $\sum_{n=0}^{\infty} n!\, z^n$ denotes the sequence $(1, 1, 2, 6, 24, \ldots, n!, \ldots)$. 

Why use formal power series notation to discuss sequences? The answer is that analytic 
operations on actual power series (such as addition, multiplication, and differentiation) 
suggest algebraic definitions of analogous formal operations on formal power series. For 
example, the product of two formal power series (discussed below) has a very strange looking 
definition using sequence notation. Using formal power series notation instead, the definition 
is seen to be a natural extension of the familiar rule for multiplying two polynomials. 
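
To make the polynomial analogy concrete, here is a minimal Python sketch that treats a formal power series as a truncated list of coefficients. It assumes the usual convolution rule $c_n = \sum_{k=0}^{n} a_k b_{n-k}$ for the product, which is exactly the rule for multiplying polynomials. The function name `multiply_series` and the truncation length are illustrative choices, not part of the text.

```python
def multiply_series(a, b):
    """Product of two truncated series given as coefficient lists.

    The coefficient of z^n in the product is sum_{k=0}^{n} a[k] * b[n - k].
    Only coefficients determined by both inputs are returned.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

geometric = [1] * 8                 # 1 + z + z^2 + ... (first 8 coefficients)
one_minus_z = [1, -1] + [0] * 6     # the polynomial 1 - z

print(multiply_series(geometric, one_minus_z))  # [1, 0, 0, 0, 0, 0, 0, 0]
print(multiply_series(geometric, geometric))    # [1, 2, 3, 4, 5, 6, 7, 8]
```

No convergence question arises here: each output coefficient is a finite sum of products of input coefficients.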

Our goal in this chapter is to apply generating functions to gain combinatorial informa- 
tion about weighted sets. In order to extract this information, we need to have some facility 
manipulating generating functions either analytically or algebraically. It turns out that most 
of these manipulations can be justified formally, using only the algebraic properties of op- 
erations on sequences. But the analytic versions of these manipulations are probably more 
familiar to most readers based on previous exposure to calculus. So we focus initially on 
computations with analytic power series, referring to other texts for proofs of convergence 
tests and other needed facts from analysis. This approach lets us get to combinatorial cal- 
culations more rapidly, without getting bogged down in a mass of technical details. Later, 
we revisit various manipulations on generating functions, showing how these calculations 
can be justified algebraically at the level of formal power series. 


5.2 Convergence of Power Series 


The next three sections review some facts from calculus about analytic power series. We 
begin with some tests for the convergence of series of complex numbers. Given a series 
$\sum_{n=0}^{\infty} c_n$ with all $c_n \in \mathbb{C}$, we say the series converges absolutely iff the real series $\sum_{n=0}^{\infty} |c_n|$ 
converges. If a series converges absolutely, then the series converges, but the converse does 
not always hold. For example, the alternating harmonic series $\sum_{n=1}^{\infty} (-1)^n/n$ converges but 
does not converge absolutely, since the harmonic series $\sum_{n=1}^{\infty} 1/n$ diverges. The Ratio Test 
and the Root Test can be used to detect the absolute convergence of a given series. 


5.5. The Ratio Test. Suppose $\sum_{n=0}^{\infty} c_n$ is a series of complex numbers such that 
$L = \lim_{n \to \infty} |c_{n+1}/c_n|$ exists in $[0, \infty]$. If $L < 1$, then the series converges absolutely. If $L > 1$, 
then the series diverges. 


5.6. The Root Test. Suppose $\sum_{n=0}^{\infty} c_n$ is a series of complex numbers such that 
$L = \lim_{n \to \infty} \sqrt[n]{|c_n|}$ exists in $[0, \infty]$. If $L < 1$, then the series converges absolutely. If $L > 1$, then 
the series diverges. 


We have already mentioned that each complex power series has a radius of convergence 
$R \in [0, \infty]$ such that the series converges at all points inside the circle $\{z \in \mathbb{C} : |z| = R\}$ 
and diverges at all points outside this circle. The next theorem formally states this fact and 
gives formulas for $R$ based on the Ratio Test and the Root Test. These formulas use the 
conventions $1/0 = \infty$ and $1/\infty = 0$. 


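5.7. Theorem: Radius of Convergence of Power Series. (a) For every sequence of 
complex numbers $(a_n : n \ge 0)$, there exists $R \in [0, \infty]$ such that for all $z \in \mathbb{C}$, 

\[ |z| < R \ \Rightarrow\ \sum_{n=0}^{\infty} a_n z^n \text{ converges absolutely}; \qquad |z| > R \ \Rightarrow\ \sum_{n=0}^{\infty} a_n z^n \text{ diverges}. \]

(b) The radius of convergence $R$ is given by 

\[ R = \lim_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right| \quad \text{and} \quad R = \frac{1}{\lim_{n \to \infty} \sqrt[n]{|a_n|}} \]

when these limits exist in $[0, \infty]$. 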


Proof. We prove this theorem in the case where one of the limits mentioned in (b) exists, 
by applying the Ratio Test or the Root Test. Fix $z \in \mathbb{C}$, and let $c_n = a_n z^n$ for $n \ge 0$. On 
one hand, if the limit $R_1 = \lim_{n \to \infty} |a_n/a_{n+1}|$ exists in $[0, \infty]$, then 

\[ \lim_{n \to \infty} \left| \frac{c_{n+1}}{c_n} \right| = \lim_{n \to \infty} \left| \frac{a_{n+1} z^{n+1}}{a_n z^n} \right| = \frac{|z|}{R_1}. \]

By the Ratio Test, if $|z| < R_1$, then $|z|/R_1 < 1$ and the power series converges absolutely; 
if $|z| > R_1$, then $|z|/R_1 > 1$ and the power series diverges. Thus (a) holds with $R = R_1$. On 
the other hand, if the limit $R_2 = 1/\lim_{n \to \infty} \sqrt[n]{|a_n|}$ exists in $[0, \infty]$, then 

\[ \lim_{n \to \infty} \sqrt[n]{|c_n|} = \lim_{n \to \infty} \sqrt[n]{|a_n z^n|} = \frac{|z|}{R_2}. \]

Now the Root Test shows that the power series converges absolutely for $|z| < R_2$ and 
diverges for $|z| > R_2$, so that (a) holds with $R = R_2$. □ 

In fact, it can be shown that the Root Test is always strong enough to prove existence of 
the radius of convergence, if we replace the second limit in (b) by $R = 1/\limsup_{n \to \infty} \sqrt[n]{|a_n|}$. 
For a proof, see Theorem 3.39 in [115]. However, it is often easier to find the radius of 
convergence using the Ratio Test. We also remark that none of the theorems mentioned so 
far gives any information about the convergence or divergence of the power series on the 
circle of convergence $\{z \in \mathbb{C} : |z| = R\}$. 

5.8. Example. Find the radius of convergence of these power series: (a) $\sum_{n=0}^{\infty} b^n z^n$, 
where $b \in \mathbb{C}$ is fixed; (b) $\sum_{n=0}^{\infty} n^n z^n$; (c) $\sum_{n=0}^{\infty} (2^n/n!) z^n$; (d) $\sum_{k=0}^{\infty} ((-1)^k/k!) z^{2k+1}$. 
We can solve (a) using either formula for $R$ in Theorem 5.7. The formula based on the 
Ratio Test gives $R = \lim_{n \to \infty} |b^n/b^{n+1}| = 1/|b|$. The formula based on the Root Test 
gives $R = 1/\lim_{n \to \infty} \sqrt[n]{|b^n|} = 1/|b|$. In contrast, for (b) the Root Test formula gives 
$R = 1/\lim_{n \to \infty} \sqrt[n]{|n^n|} = 1/\lim_{n \to \infty} n = 0$, so this power series converges only at $z = 0$. 
For (c), we take ratios: 

\[ R = \lim_{n \to \infty} \left| \frac{2^n/n!}{2^{n+1}/(n+1)!} \right| = \lim_{n \to \infty} \frac{n+1}{2} = \infty. \]

For (d), the ratio-based formula for $R$ in Theorem 5.7 technically does not apply, since 
$a_n = 0$ for all even $n$. But the original version of the Ratio Test can be used on the series 
of nonzero terms $c_k = (-1)^k z^{2k+1}/k!$ indexed by $k$. For any nonzero $z \in \mathbb{C}$, we find that 

\[ \lim_{k \to \infty} \left| \frac{c_{k+1}}{c_k} \right| = \lim_{k \to \infty} \left| \frac{(-1)^{k+1} z^{2k+3}/(k+1)!}{(-1)^k z^{2k+1}/k!} \right| = \lim_{k \to \infty} \frac{|z|^2}{k+1} = 0 < 1. \]

So this series converges for every $z \in \mathbb{C}$, and the radius of convergence is $R = \infty$. □ 
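
The limits in this example can be sampled numerically. The Python sketch below is only a sanity check under stated assumptions: the function names are hypothetical, part (a) uses the sample value $b = 3$, and each estimate is evaluated at a single finite index rather than in the limit. Exact rational arithmetic is used for the ratios so that large factorials cause no rounding.

```python
from fractions import Fraction
from math import factorial

def ratio_estimate(coeff, n):
    """|a_n / a_(n+1)|, whose limit (when it exists) is R."""
    return abs(Fraction(coeff(n)) / Fraction(coeff(n + 1)))

def root_estimate(coeff, n):
    """1 / |a_n|^(1/n), whose limit (when it exists) is R."""
    return 1 / float(abs(coeff(n))) ** (1 / n)

def series_a(n, b=3):
    """(a) a_n = b^n, with the sample value b = 3."""
    return b ** n

def series_b(n):
    """(b) a_n = n^n."""
    return n ** n

def series_c(n):
    """(c) a_n = 2^n / n!."""
    return Fraction(2 ** n, factorial(n))

def term_ratio_d(z, k):
    """|c_(k+1) / c_k| for the nonzero terms c_k = (-1)^k z^(2k+1) / k! of (d)."""
    def term(j):
        return Fraction(-1) ** j * Fraction(z) ** (2 * j + 1) / factorial(j)
    return abs(term(k + 1) / term(k))

print(ratio_estimate(series_a, 10))   # exactly 1/|b|
print(root_estimate(series_b, 100))   # 1/n at n = 100, heading toward 0
print(ratio_estimate(series_c, 9))    # (n + 1)/2 at n = 9, growing without bound
print(term_ratio_d(10, 999))          # |z|^2/(k + 1), heading toward 0
```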


5.3 Examples of Analytic Power Series 


We continue to review facts from analysis about complex power series. The next theorem 
is a fundamental result of complex analysis that explains the close connection between 
power series and analytic functions of a complex variable. For $0 < R \le \infty$, we use the 
notation $D(0;R) = \{z \in \mathbb{C} : |z| < R\}$ for the open disk with center 0 and radius $R$ in the 
complex plane. A function $G : D(0;R) \to \mathbb{C}$ is called analytic iff $G$ has a complex derivative 
at each point of $D(0;R)$. This means that $G'(z) = \lim_{h \to 0,\, h \in \mathbb{C}} \frac{G(z+h) - G(z)}{h}$ exists for all 
$z \in D(0;R)$. The notation $G^{(n)}(z_0)$ denotes the $n$th complex derivative of $G$ at $z_0$, when 
this exists. 


5.9. Theorem: Power Series and Analytic Functions. (a) Suppose $\sum_{n=0}^{\infty} a_n z^n$ is a 
complex power series with radius of convergence $R > 0$. Then the function $G : D(0;R) \to \mathbb{C}$ 
defined by $G(z) = \sum_{n=0}^{\infty} a_n z^n$ for $|z| < R$ has complex derivatives of all orders, and 
$a_n = G^{(n)}(0)/n!$ for all $n \ge 0$. 

(b) Suppose $R > 0$ and $G : D(0;R) \to \mathbb{C}$ is an analytic function. Then $G$ has derivatives 
of all orders on $D(0;R)$, $G(z) = \sum_{n=0}^{\infty} \frac{G^{(n)}(0)}{n!} z^n$ for all $z \in D(0;R)$, and the radius of 
convergence of this power series is at least $R$. 


See Chapter 5 of [23] for a proof of this theorem. In part (a), the formula $a_n = G^{(n)}(0)/n!$ 
shows that the coefficients $a_n$ in the given power series are uniquely determined from the 
analytic function $G$ defined by that power series, provided the radius of convergence is 
positive. Part (b) lets us manufacture many examples of convergent power series by starting 
with known analytic functions. 


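5.10. Example. The complex exponential function $G(z) = e^z$ is defined and analytic on 
the entire complex plane. Like the real exponential function, $G$ satisfies $G'(z) = e^z = G(z)$, 
so $G^{(n)}(z) = e^z$ and $G^{(n)}(0) = 1$ for all $n \ge 0$. By part (b) of the previous theorem, we have 
the power series expansion 

\[ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \frac{z^4}{4!} + \cdots, \]

which is valid for all complex $z$ (i.e., $R = \infty$). Similarly, by taking derivatives repeatedly, 
we obtain the following expansions: 

\[ \sin z = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} z^{2k+1} = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \frac{z^7}{7!} + \cdots \quad \text{for all } z \in \mathbb{C}; \]

\[ \cos z = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!} z^{2k} = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \cdots \quad \text{for all } z \in \mathbb{C}; \]

\[ \sinh z = \sum_{k=0}^{\infty} \frac{z^{2k+1}}{(2k+1)!} = z + \frac{z^3}{3!} + \frac{z^5}{5!} + \frac{z^7}{7!} + \cdots \quad \text{for all } z \in \mathbb{C}; \]

\[ \cosh z = \sum_{k=0}^{\infty} \frac{z^{2k}}{(2k)!} = 1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \frac{z^6}{6!} + \cdots \quad \text{for all } z \in \mathbb{C}. \]

Here the hyperbolic trigonometric functions are defined by $\sinh z = (e^z - e^{-z})/2$ and 
$\cosh z = (e^z + e^{-z})/2$. 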
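
These expansions are easy to spot-check at a complex point. The Python sketch below is an illustration only: the test point $z = 1.5 - 0.75i$, the cutoff of 40 terms, and the helper names are arbitrary choices. It compares truncated series against the functions in the standard `cmath` module.

```python
import cmath
from math import factorial

def taylor_sum(coeff, z, terms=40):
    """Partial sum of coeff(n) * z^n for n = 0, ..., terms - 1."""
    return sum(coeff(n) * z ** n for n in range(terms))

def exp_coeff(n):
    """Coefficient of z^n in the series for e^z."""
    return 1 / factorial(n)

def sin_coeff(n):
    """Coefficient of z^n in the series for sin z (zero when n is even)."""
    return 0 if n % 2 == 0 else (-1) ** ((n - 1) // 2) / factorial(n)

def cosh_coeff(n):
    """Coefficient of z^n in the series for cosh z (zero when n is odd)."""
    return 1 / factorial(n) if n % 2 == 0 else 0

z = 1.5 - 0.75j
exp_error = abs(taylor_sum(exp_coeff, z) - cmath.exp(z))
sin_error = abs(taylor_sum(sin_coeff, z) - cmath.sin(z))
cosh_error = abs(taylor_sum(cosh_coeff, z) - cmath.cosh(z))

print(exp_error, sin_error, cosh_error)  # all at the level of rounding error
```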


5.11. Example. In complex analysis, there is a version of the natural logarithm function, 
denoted Log, which is defined and analytic on the domain $\mathbb{C} - \{z \in \mathbb{R} : z \le 0\}$. The derivative 
of $\mathrm{Log}\, z$ is $1/z$ on this domain. It follows that $G(z) = \mathrm{Log}(1+z)$ is defined and analytic 
on the disk $D(0;1)$, with $G'(z) = (1+z)^{-1}$, $G''(z) = -(1+z)^{-2}$, $G'''(z) = 2(1+z)^{-3}$, and 
in general $G^{(n)}(z) = (-1)^{n-1}(n-1)!\,(1+z)^{-n}$ for all $n \ge 1$. Setting $z = 0$ gives $G(0) = 0$ 
and $G^{(n)}(0) = (-1)^{n-1}(n-1)!$ for $n > 0$. By Theorem 5.9 on Power Series and Analytic 
Functions, we obtain the power series expansion 

\[ \mathrm{Log}(1+z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots \quad \text{for } |z| < 1. \]

Noting that $|z| < 1$ iff $|-z| < 1$, we can replace $z$ by $-z$ to obtain 

\[ -\mathrm{Log}(1-z) = \sum_{n=1}^{\infty} \frac{z^n}{n} = z + \frac{z^2}{2} + \frac{z^3}{3} + \cdots \quad \text{for } |z| < 1. \]
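
Both logarithm expansions can be checked against the principal branch `cmath.log` at a point of the open unit disk. In this Python sketch the function names, the sample point $z = 0.3 + 0.4i$, and the cutoff of 200 terms are assumptions made for illustration.

```python
import cmath

def log_one_plus_series(z, terms=200):
    """Partial sum of (-1)^(n-1) z^n / n for n = 1, ..., terms."""
    return sum((-1) ** (n - 1) * z ** n / n for n in range(1, terms + 1))

def neg_log_one_minus_series(z, terms=200):
    """Partial sum of z^n / n for n = 1, ..., terms."""
    return sum(z ** n / n for n in range(1, terms + 1))

z = 0.3 + 0.4j   # |z| = 0.5 < 1
error_plus = abs(log_one_plus_series(z) - cmath.log(1 + z))
error_minus = abs(neg_log_one_minus_series(z) + cmath.log(1 - z))

print(error_plus, error_minus)  # both at the level of rounding error
```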


For nonzero $b \in \mathbb{C}$, we have already seen (§2.2) the Geometric Series expansion 

\[ (1 - bz)^{-1} = \frac{1}{1 - bz} = \sum_{n=0}^{\infty} b^n z^n = 1 + bz + b^2 z^2 + b^3 z^3 + \cdots \quad \text{for } |z| < 1/|b|. \]

For a positive integer $m$, we have also seen (§2.3) the Binomial Theorem 

\[ (1+z)^m = \sum_{n=0}^{m} \binom{m}{n} z^n \quad \text{for } z \in \mathbb{C}. \]

We will make constant use of these expansions as well as a generalization called the Extended 
Binomial Theorem, which we discuss now. 

Given any complex constant $r$, there is an analytic function $G(z) = (1+z)^r$ with 
domain $D(0;1)$ given by $G(z) = e^{r\,\mathrm{Log}(1+z)}$ for $|z| < 1$. When $r \in \mathbb{Z}$, it can be shown 
that this definition coincides with the algebraic definition of integer powers; specifically, for 
$r \in \mathbb{Z}_{\ge 0}$, $G(z)$ is the product of $r$ copies of $(1+z)$, and when $r \in \mathbb{Z}_{<0}$, $G(z)$ is the product 
of $|r|$ copies of the multiplicative inverse of $1+z$. Using properties of complex exponentials 
and logarithms, it can also be shown that $(1+z)^{r+s} = (1+z)^r (1+z)^s$ for all $r, s \in \mathbb{C}$, 
and $G'(z) = r(1+z)^{r-1}$. More generally, for any $n \ge 0$, $G^{(n)}(z) = (r){\downarrow}_n (1+z)^{r-n}$ where 
$(r){\downarrow}_0 = 1$ and $(r){\downarrow}_n = r(r-1)(r-2)\cdots(r-n+1)$ for $n > 0$. Applying Theorem 5.9 on 
Power Series and Analytic Functions, we obtain the following expansion. 


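5.12. The Extended Binomial Theorem. Given any $r \in \mathbb{C}$,

\[ (1+z)^r = \sum_{n=0}^{\infty} \frac{(r){\downarrow}_n}{n!} z^n = 1 + rz + \frac{r(r-1)}{2!} z^2 + \frac{r(r-1)(r-2)}{3!} z^3 + \cdots \quad \text{for } |z| < 1. \]

By taking $r$ to be a positive integer $m$, we recover the original Binomial Theorem, since $(m){\downarrow}_n = 0$ for all $n > m$, whereas $(m){\downarrow}_n / n! = \binom{m}{n}$ for $0 \le n \le m$. Taking $r = -1$ and replacing $z$ by $-bz$, we recover the Geometric Series formula, since $(-1){\downarrow}_n / n! = (-1)^n$ and $|-bz| < 1$ iff $|z| < 1/|b|$.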
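The two special cases just mentioned can be spot-checked numerically. The following is a minimal sketch, assuming exact rational arithmetic and truncated coefficient lists; the function names are illustrative and do not come from the text.

```python
from fractions import Fraction

def falling_factorial(r, n):
    """(r) falling n = r (r-1) ... (r-n+1), with the empty product 1 for n = 0."""
    result = Fraction(1)
    for i in range(n):
        result *= r - i
    return result

def extended_binomial_coeffs(r, terms):
    """Coefficients of z^0, ..., z^(terms-1) in the expansion of (1+z)^r."""
    r = Fraction(r)
    coeffs = []
    n_factorial = 1
    for n in range(terms):
        if n > 0:
            n_factorial *= n
        coeffs.append(falling_factorial(r, n) / n_factorial)
    return coeffs

def cauchy_product(a, b):
    """Truncated product of two coefficient lists of equal length."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]
```

Taking r = 4 returns the binomial coefficients followed by zeros, and r = -1 returns alternating signs. Squaring the truncated series for r = 1/2 reproduces the coefficients of 1 + z.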

Another frequently needed special case of the Extended Binomial Theorem is obtained by taking $r$ to be a negative integer $-m$ and replacing $z$ by $-z$. Noting that

\[ (-m){\downarrow}_n (-z)^n / n! = m(m+1)\cdots(m+n-1)\, z^n / n! = \binom{n+m-1}{n,\, m-1} z^n, \]

we deduce the following expansion.


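5.13. The Negative Binomial Theorem. Given any integer $m > 0$,

\[ (1-z)^{-m} = \sum_{n=0}^{\infty} \frac{m(m+1)\cdots(m+n-1)}{n!} z^n = \sum_{n=0}^{\infty} \binom{n+m-1}{n,\, m-1} z^n \quad \text{for } |z| < 1. \]

For example, given any $z \in D(0;1)$,

\[ (1-z)^{-1} = \sum_{n=0}^{\infty} \binom{n+0}{0} z^n = \sum_{n=0}^{\infty} z^n = 1 + z + z^2 + z^3 + z^4 + \cdots; \]
\[ (1-z)^{-2} = \sum_{n=0}^{\infty} \binom{n+1}{1} z^n = \sum_{n=0}^{\infty} (n+1) z^n = 1 + 2z + 3z^2 + 4z^3 + 5z^4 + \cdots; \]
\[ (1-z)^{-3} = \sum_{n=0}^{\infty} \binom{n+2}{2} z^n = 1 + 3z + 6z^2 + 10z^3 + 15z^4 + \cdots; \]
\[ (1-z)^{-4} = \sum_{n=0}^{\infty} \binom{n+3}{3} z^n = 1 + 4z + 10z^2 + 20z^3 + 35z^4 + \cdots. \]

Replacing $z$ by $bz$, where $b \in \mathbb{C}$ is nonzero, we also see:

\[ \text{the coefficient of } z^n \text{ in the power series for } (1-bz)^{-m} \text{ is } b^n \binom{n+m-1}{n,\, m-1}. \]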
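As a sanity check on the coefficient formula, the series for $(1-bz)^{-m}$ can also be built by multiplying by the geometric series $m$ times. The sketch below assumes integer $b$ and a fixed truncation length; the helper names are hypothetical.

```python
from math import comb

def inverse_power_coeffs(b, m, terms):
    """Coefficients of (1 - b z)^(-m) up to z^(terms-1), by repeated multiplication.

    Multiplying a series A(z) by (1 - b z)^(-1) gives C(z) with
    c[0] = a[0] and c[n] = a[n] + b * c[n-1], which is applied in place.
    """
    coeffs = [1] + [0] * (terms - 1)   # start from the constant series 1
    for _ in range(m):
        for n in range(1, terms):
            coeffs[n] += b * coeffs[n - 1]
    return coeffs

def closed_form(b, m, n):
    # b^n * C(n+m-1, m-1), the coefficient stated after the Negative Binomial Theorem
    return b ** n * comb(n + m - 1, m - 1)
```

With b = 1 and m = 4 this returns 1, 4, 10, 20, 35, matching the last expansion displayed above.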
5.4 Operations on Power Series


Given analytic functions $F$ and $G$ defined on some disk $D(0;R)$, we can form new analytic functions by performing operations such as addition, multiplication, and differentiation. The next theorem gives formulas for the coefficients in the power series representations of these new functions.


5.14. Theorem: Operations on Analytic Power Series. Suppose $F(z) = \sum_{n=0}^{\infty} a_n z^n$ and $G(z) = \sum_{n=0}^{\infty} b_n z^n$ are analytic functions on $D(0;R)$, where $R > 0$. The following rules hold for all $z \in D(0;R)$.

(a) Equality Rule. $F = G$ on $D(0;R)$ iff $a_n = b_n$ for all $n \ge 0$.

(b) Sum Rule. $F(z) + G(z) = \sum_{n=0}^{\infty} (a_n + b_n) z^n$.

(c) Scalar Multiple Rule. For $c \in \mathbb{C}$, $cF(z) = \sum_{n=0}^{\infty} (c a_n) z^n$.

(d) Power-Shifting Rule. For $k \in \mathbb{Z}_{\ge 0}$, $z^k F(z) = \sum_{n=k}^{\infty} a_{n-k} z^n$.

(e) Product Rule. $F(z)G(z) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) z^n$.

(f) Derivative Rule. $F'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} z^n$.

(g) Shifted Derivative Rule. $zF'(z) = \sum_{n=0}^{\infty} n a_n z^n$.


Proof. Each part is proved by combining known derivative rules with Theorem 5.9 on Power Series and Analytic Functions. We prove (b), (e), and (f) as illustrations. For (b), the Sum Rule for Derivatives shows that $S(z) = F(z) + G(z)$ is analytic on $D(0;R)$ with $S'(z) = F'(z) + G'(z)$. More generally, $S^{(n)}(z) = F^{(n)}(z) + G^{(n)}(z)$ for all $n \ge 0$. We know there is a power series expansion $S(z) = \sum_{n=0}^{\infty} c_n z^n$ valid for $z \in D(0;R)$, and moreover

\[ c_n = \frac{S^{(n)}(0)}{n!} = \frac{F^{(n)}(0)}{n!} + \frac{G^{(n)}(0)}{n!} = a_n + b_n \quad \text{for all } n \ge 0. \]

For (e), define $P(z) = F(z)G(z)$. Repeatedly using the Product Rule for Derivatives, we find that $P'(z) = F'(z)G(z) + F(z)G'(z)$, hence $P'' = F''G + 2F'G' + FG''$, then $P''' = F'''G + 3F''G' + 3F'G'' + FG'''$, and so on. By induction on $n$, it can be proved that

\[ P^{(n)}(z) = \sum_{k=0}^{n} \binom{n}{k} F^{(k)}(z)\, G^{(n-k)}(z) \tag{5.2} \]

(note the analogy to the Binomial Theorem). By Theorem 5.9(b), there is a power series expansion $P(z) = \sum_{n=0}^{\infty} d_n z^n$ valid for $z \in D(0;R)$, where

\[ d_n = \frac{P^{(n)}(0)}{n!} = \sum_{k=0}^{n} \frac{F^{(k)}(0)}{k!} \cdot \frac{G^{(n-k)}(0)}{(n-k)!} = \sum_{k=0}^{n} a_k b_{n-k} \quad \text{for all } n \ge 0. \]

We remark that if $F$ and $G$ are finite series (i.e., polynomials in $z$), then (e) can be proved algebraically using the Generalized Distributive Law.

Finally, we prove (f), which says that the power series representation of an analytic function can be differentiated term by term. We know $F'$ is analytic on $D(0;R)$ since $F'$ has derivatives of all orders. So there is an expansion $F'(z) = \sum_{n=0}^{\infty} e_n z^n$ for $z \in D(0;R)$, where

\[ e_n = \frac{(F')^{(n)}(0)}{n!} = (n+1)\,\frac{F^{(n+1)}(0)}{(n+1)!} = (n+1)\, a_{n+1} \quad \text{for all } n \ge 0. \qquad \Box \]

The preceding theorem motivates the following definitions of algebraic operations on formal power series. Recall that a formal power series $A = A(z) = \sum_{n=0}^{\infty} a_n z^n$ is defined to be the infinite sequence of coefficients $(a_n : n \ge 0)$. The coefficients $a_n$ can belong to any fixed commutative ring $R$ with identity $1_R$, although we usually take $R = \mathbb{C}$.


5.15. Definitions: Operations on Formal Power Series. Let $A(z) = \sum_{n=0}^{\infty} a_n z^n$ and $B(z) = \sum_{n=0}^{\infty} b_n z^n$ be formal power series.

(a) Equality. $A(z) = B(z)$ iff $a_n = b_n$ for all $n \ge 0$.

(b) Addition. $A(z) + B(z) = \sum_{n=0}^{\infty} (a_n + b_n) z^n$.

(c) Scalar Multiples. For $c \in R$, $cA(z) = \sum_{n=0}^{\infty} (c a_n) z^n$.

(d) Power-Shifting Rule. For $k \in \mathbb{Z}_{\ge 0}$, $z^k A(z) = \sum_{n=k}^{\infty} a_{n-k} z^n$.

(e) Multiplication. $A(z)B(z) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) z^n$.

(f) Formal Differentiation. $A'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} z^n$.
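The operations in Definition 5.15 translate directly into code if each series is truncated to its first few coefficients. The following is only a sketch under that assumption (a true formal power series is an infinite sequence); the function names are illustrative.

```python
def add(a, b):
    # (b) Addition: coefficientwise sum
    return [x + y for x, y in zip(a, b)]

def scale(c, a):
    # (c) Scalar multiple
    return [c * x for x in a]

def shift(k, a):
    # (d) Power shifting: coefficients of z^k * A(z), truncated to len(a)
    return ([0] * k + list(a))[:len(a)]

def mul(a, b):
    # (e) Multiplication: c_n = sum_{k=0}^{n} a_k * b_{n-k}; a and b have equal length
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]

def deriv(a):
    # (f) Formal derivative: coefficient of z^n is (n+1) * a_{n+1}
    return [(n + 1) * a[n + 1] for n in range(len(a) - 1)]
```

Because the coefficient of $z^n$ in a product depends only on coefficients of index at most $n$, truncation does not affect the coefficients that are kept. For instance, `mul([1]*6, [1, -1, 0, 0, 0, 0])` returns the truncation of the constant series 1, reflecting that the geometric series is the inverse of $1 - z$.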


Now that operations on formal power series have been defined, it is necessary to prove algebraic identities and derivative rules similar to rules already known for analytic functions. For example, there are formal versions of the associative laws, the commutative laws, the distributive laws, the Sum Rule for Derivatives, the Product Rule for Derivatives, and Taylor's Formula. We leave most of these verifications as exercises, but we prove the distributive law to illustrate the method.

Fix formal power series $A = \sum_{n=0}^{\infty} a_n z^n$, $B = \sum_{n=0}^{\infty} b_n z^n$, and $C = \sum_{n=0}^{\infty} c_n z^n$; we prove $A(B+C) = AB + AC$. On the left side, $B + C = \sum_{n=0}^{\infty} (b_n + c_n) z^n$, so the coefficient of $z^n$ in $A(B+C)$ is $\sum_{k=0}^{n} a_k (b_{n-k} + c_{n-k}) = \sum_{k=0}^{n} (a_k b_{n-k} + a_k c_{n-k})$. On the right side, $AB = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) z^n$ and $AC = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k c_{n-k} \right) z^n$, so the coefficient of $z^n$ in $AB + AC$ is $\sum_{k=0}^{n} a_k b_{n-k} + \sum_{k=0}^{n} a_k c_{n-k}$. This equals the coefficient of $z^n$ in $A(B+C)$ for each $n \ge 0$. So $A(B+C) = AB + AC$ follows from the definition of equality of formal power series.


5.5 Solving Recursions with Generating Functions 


We have now developed enough computational machinery to describe our first application of generating functions: solving recursions. In §2.16, we briefly discussed a recipe to solve homogeneous recursions with constant coefficients. Generating functions provide a powerful general method for solving these recursions and many others. The following examples illustrate the technique.


5.16. Example. Define a sequence $(a_n : n \ge 0)$ by $a_0 = 2$ and $a_n = 3a_{n-1} - 1$ for all $n \ge 1$. What is a closed formula for $a_n$? Our approach is to introduce a generating function $F(z) = \sum_{n=0}^{\infty} a_n z^n$ whose coefficients encode all of the unknown quantities $a_n$. The next step is to use the recursion for $a_n$ to deduce an algebraic equation satisfied by this generating function. Since the recursion is only valid for $n \ge 1$, we first subtract the constant term $a_0 = 2$ from $F(z)$. We then compute

\[ F(z) - 2 = F(z) - a_0 = \sum_{n=1}^{\infty} a_n z^n = \sum_{n=1}^{\infty} (3a_{n-1} - 1) z^n = 3z \sum_{n=1}^{\infty} a_{n-1} z^{n-1} - z \sum_{n=1}^{\infty} z^{n-1}. \]

We can simplify these sums by changing the summation variable to $m = n - 1$. This leads to

\[ F(z) - 2 = 3z \sum_{m=0}^{\infty} a_m z^m - z \sum_{m=0}^{\infty} z^m = 3zF(z) - \frac{z}{1-z}. \]

Using algebra to rearrange this equation, we get $(1-3z)F(z) = 2 - z/(1-z)$, or

\[ F(z) = \frac{2-3z}{(1-z)(1-3z)}. \]

This equation gives a generating function solution to the original recursion. To finish, we need only determine the coefficient of $z^n$ in this generating function. This can be achieved using the partial fraction decomposition learned in calculus. (See §11.4 for a detailed development of this technique.) We write

\[ F(z) = \frac{2-3z}{(1-z)(1-3z)} = \frac{B}{1-z} + \frac{C}{1-3z} \tag{5.3} \]

for some unknown constants $B$ and $C$. Clearing denominators gives

\[ 2 - 3z = B(1-3z) + C(1-z). \]

Setting $z = 1$ gives $-1 = -2B$, so $B = 1/2$. Setting $z = 1/3$ gives $1 = (2/3)C$, so $C = 3/2$. Putting these values back into (5.3) and using the Geometric Series Formula, we get

\[ \sum_{n=0}^{\infty} a_n z^n = F(z) = B \sum_{n=0}^{\infty} z^n + C \sum_{n=0}^{\infty} 3^n z^n = \sum_{n=0}^{\infty} \left( 1/2 + 3^{n+1}/2 \right) z^n. \]

We conclude that $a_n = (3^{n+1} + 1)/2$ for all $n \ge 0$.
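The result of Example 5.16 can be checked three ways: by iterating the recursion, by expanding the rational generating function with long division, and by evaluating the closed formula. The sketch below assumes integer coefficients and a denominator with constant term 1; the names are hypothetical.

```python
def rational_series(num, den, terms):
    """First `terms` coefficients of num(z)/den(z), assuming den[0] == 1.

    From den(z) * F(z) = num(z): f_n = num_n - sum_{k>=1} den_k * f_{n-k}.
    """
    coeffs = []
    for n in range(terms):
        value = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            value -= den[k] * coeffs[n - k]
        coeffs.append(value)
    return coeffs

def a_by_recursion(terms):
    # a_0 = 2 and a_n = 3 a_{n-1} - 1
    values = [2]
    for _ in range(1, terms):
        values.append(3 * values[-1] - 1)
    return values

def a_closed_form(n):
    return (3 ** (n + 1) + 1) // 2
```

Here the numerator $2 - 3z$ is `[2, -3]` and the denominator $(1-z)(1-3z) = 1 - 4z + 3z^2$ is `[1, -4, 3]`.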


Before leaving this example, we must consider a subtle logical point. At the very end of the example, we found $a_n$ by invoking the Equality Rule in part (a) of Theorem 5.14. But this rule can be used only if we already know that the analytic power series $\sum_{n=0}^{\infty} a_n z^n$ has a positive radius of convergence. One way to resolve this problem is to prove by induction that the claimed solution $a_n = (3^{n+1} + 1)/2$ really does satisfy the recursion and initial condition.

Another approach is to note that $G(z) = B(1-z)^{-1} + C(1-3z)^{-1}$ converges for all $z \in D(0;1/3)$. Reversing the algebraic steps that led to (5.3), we see that $G$ solves the generating function identity $G(z) - 2 = 3zG(z) - z/(1-z)$. In turn, it can be checked that this identity implies that the coefficients $a'_n$ of $G(z)$ satisfy the original recursion and initial condition. A quick induction on $n$ shows that there is exactly one solution to that recursion and initial condition. So the coefficients of $G$, namely $a'_n = (3^{n+1} + 1)/2$, do solve the original problem.

By using formal power series, we can avoid the awkwardness of doing an extra induction argument or checking the reversibility of intricate generating function manipulations. In this approach, we start by defining $F = (a_n : n \ge 0)$ to be the unique sequence satisfying the given recursion and initial conditions. Then we perform all the algebraic steps above (always operating on formal power series) to see that $F$ is the sequence $((3^{n+1} + 1)/2 : n \ge 0)$. By definition of equality of formal power series, $a_n = (3^{n+1} + 1)/2$ follows. The one catch here is that we must already know that formal power series obey all the laws of algebra including rules for manipulating fractions. This raises the question of what is meant when we divide one formal power series by another. In the present case, it suffices to know that the formal geometric series $\sum_{n=0}^{\infty} b^n z^n$ is the multiplicative inverse of the formal power series $1 - bz$ (taking $b = 1$ and $b = 3$). We examine multiplicative inverses of formal power series more closely later (see §11.3). For now, let us look at more examples of solving recursions.


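5.17. Example. Define $a_0 = 2$, $a_1 = 1$, and $a_n = a_{n-2} + 2n$ for $n \ge 2$. What is a closed formula for $a_n$? Proceeding as before, we define a generating function $F(z) = \sum_{n=0}^{\infty} a_n z^n$ whose coefficients are the unknown values $a_n$. To use the recursion, we first subtract the first two terms $a_0 + a_1 z = 2 + z$ from $F(z)$:

\[ F(z) - 2 - z = \sum_{n=2}^{\infty} a_n z^n = \sum_{n=2}^{\infty} (a_{n-2} + 2n) z^n = z^2 \sum_{n=2}^{\infty} a_{n-2} z^{n-2} + 2z \sum_{n=2}^{\infty} n z^{n-1}. \]

Introducing the new summation variable $m = n - 2$, the first term on the right side becomes $z^2 \sum_{m=0}^{\infty} a_m z^m = z^2 F(z)$. The second term looks like a derivative, suggesting the calculation:

\[ \sum_{n=2}^{\infty} n z^{n-1} = \frac{d}{dz} \sum_{n=2}^{\infty} z^n = \frac{d}{dz} \left[ (1-z)^{-1} - 1 - z \right] = (1-z)^{-2} - 1. \]

Thus $F(z) - 2 - z = z^2 F(z) + 2z(1-z)^{-2} - 2z$. Solving for $F(z)$, we get

\[ F(z) = \frac{2z(1-z)^{-2} - z + 2}{1 - z^2} = \frac{-z^3 + 4z^2 - 3z + 2}{(1-z)^3 (1+z)}. \]

In this case, the partial fraction decomposition looks like

\[ F(z) = \frac{-z^3 + 4z^2 - 3z + 2}{(1-z)^3(1+z)} = \frac{A}{1-z} + \frac{B}{(1-z)^2} + \frac{C}{(1-z)^3} + \frac{D}{1+z}. \]

Clearing denominators gives

\[ -z^3 + 4z^2 - 3z + 2 = A(1-z)^2(1+z) + B(1-z)(1+z) + C(1+z) + D(1-z)^3. \tag{5.4} \]

Setting $z = 1$ gives $2C = 2$, so $C = 1$. Setting $z = -1$ gives $8D = 10$, so $D = 5/4$. Plugging these values in and doing a little more algebra, we find that $A = 1/4$ and $B = -1/2$. Now, by the Negative Binomial Theorem,

\[ \sum_{n=0}^{\infty} a_n z^n = F(z) = A \sum_{n=0}^{\infty} z^n + B \sum_{n=0}^{\infty} \binom{n+1}{1} z^n + C \sum_{n=0}^{\infty} \binom{n+2}{2} z^n + D \sum_{n=0}^{\infty} (-1)^n z^n. \]

Equating coefficients of $z^n$, we finally reach the solution

\[ a_n = (1/4) - (1/2)(n+1) + \binom{n+2}{2} + (5/4)(-1)^n = n^2/2 + n + 3/4 + (5/4)(-1)^n. \]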
It can be checked by induction that this solution does satisfy the original recursion and initial conditions. Alternatively, we can reverse all the steps, noting that the tentative solution $F(z)$ has radius of convergence 1. Even better, we can solve the whole problem using formal power series, assuming that the necessary facts about formal multiplicative inverses are available.
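Short of a full induction, the closed formula of Example 5.17 can be compared with the recursion for a range of indices. This is a minimal sketch using exact fractions; the function names are illustrative.

```python
from fractions import Fraction

def a_by_recursion(terms):
    # a_0 = 2, a_1 = 1, and a_n = a_{n-2} + 2n for n >= 2
    values = [2, 1]
    for n in range(2, terms):
        values.append(values[n - 2] + 2 * n)
    return values[:terms]

def a_closed_form(n):
    # n^2/2 + n + 3/4 + (5/4)(-1)^n
    return Fraction(n * n, 2) + n + Fraction(3, 4) + Fraction(5, 4) * (-1) ** n
```

Agreement on finitely many terms is evidence only; the induction or the formal power series argument is still what proves the formula.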


5.18. Example. Let us solve the recursion $a_n = 4a_{n-1} - 4a_{n-2} + n^2 2^n$ for $n \ge 2$, with initial conditions $a_0 = a_1 = 1$. Let $(a_n)$ be the unique solution to this problem, and define $F(z) = \sum_{n=0}^{\infty} a_n z^n$. We compute

\[ F(z) - 1 - z = \sum_{n=2}^{\infty} a_n z^n = \sum_{n=2}^{\infty} (4a_{n-1} - 4a_{n-2} + n^2 2^n) z^n \]
\[ = 4z \sum_{n=2}^{\infty} a_{n-1} z^{n-1} - 4z^2 \sum_{n=2}^{\infty} a_{n-2} z^{n-2} + \sum_{n=2}^{\infty} n^2 2^n z^n. \]

The first sum is $\sum_{m=1}^{\infty} a_m z^m = F(z) - 1$, and the second sum is $\sum_{m=0}^{\infty} a_m z^m = F(z)$.

But what is the third sum? This sum resembles the geometric series $G(z) = \sum_{n=0}^{\infty} 2^n z^n = (1-2z)^{-1}$, but each summand has an extra $n^2$ and the sum starts at $n = 2$, not $n = 0$. We can make this $n^2$ appear by repeated use of rule (g) from Theorem 5.14. Using rule (g) once, we see that $zG'(z) = \sum_{n=0}^{\infty} n 2^n z^n$. Taking the derivative of $(1-2z)^{-1}$ and then multiplying by $z$, we deduce that $H(z) = \sum_{n=0}^{\infty} n 2^n z^n = 2z(1-2z)^{-2}$. Now use rule (g) again on $H$, taking another derivative and multiplying by $z$. We get

\[ \sum_{n=0}^{\infty} n^2 2^n z^n = 2z(1-2z)^{-2} + 8z^2(1-2z)^{-3}. \tag{5.5} \]

The third sum above starts at $n = 2$, not $n = 0$, so we must subtract $0 + 2z$ from this expression.

Returning to the equation for $F(z)$, we now have

\[ F(z) - 1 - z = 4z(F(z) - 1) - 4z^2 F(z) + 2z(1-2z)^{-2} + 8z^2(1-2z)^{-3} - 2z. \]

This rearranges to

\[ (1 - 4z + 4z^2) F(z) = 1 - 5z + 2z(1-2z)^{-2} + 8z^2(1-2z)^{-3}. \]

Since $1 - 4z + 4z^2 = (1-2z)^2$, we finally obtain

\[ F(z) = (1-2z)^{-2} - 5z(1-2z)^{-2} + 2z(1-2z)^{-4} + 8z^2(1-2z)^{-5}. \]

We can use the Negative Binomial Theorem and the Power-Shifting Rule to get the series expansion for each term on the right. For instance, $2z(1-2z)^{-4} = \sum_{k=0}^{\infty} 2z \binom{k+3}{3} (2z)^k = \sum_{n=1}^{\infty} \binom{n+2}{3} 2^n z^n$ (letting $n = k+1$). The coefficient of $z^n$ in $F(z)$ is

\[ a_n = \binom{n+1}{1} 2^n - 5 \binom{n}{1} 2^{n-1} + \binom{n+2}{3} 2^n + 8 \binom{n+2}{4} 2^{n-2} = \frac{n^4 + 4n^3 + 5n^2 - 16n + 12}{12} \cdot 2^n. \]


For a general theorem regarding generating function solutions to recursions, see §11.5. 
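The quartic formula obtained in Example 5.18 is easy to mistype, so a numerical comparison against the recursion is a useful safeguard. The sketch below uses hypothetical function names and exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def a_by_recursion(terms):
    # a_0 = a_1 = 1 and a_n = 4 a_{n-1} - 4 a_{n-2} + n^2 2^n for n >= 2
    values = [1, 1]
    for n in range(2, terms):
        values.append(4 * values[n - 1] - 4 * values[n - 2] + n * n * 2 ** n)
    return values[:terms]

def a_from_binomials(n):
    # term-by-term coefficient extraction; Fractions handle 2^(n-1), 2^(n-2) for small n
    two = Fraction(2)
    return (comb(n + 1, 1) * two ** n - 5 * comb(n, 1) * two ** (n - 1)
            + comb(n + 2, 3) * two ** n + 8 * comb(n + 2, 4) * two ** (n - 2))

def a_closed_form(n):
    poly = n ** 4 + 4 * n ** 3 + 5 * n ** 2 - 16 * n + 12
    return Fraction(poly * 2 ** n, 12)
```

Both the binomial expression and the simplified polynomial form should reproduce the values 1, 1, 16, ... generated by the recursion.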
5.6 Evaluating Summations with Generating Functions


This section discusses another application of generating functions: evaluating summations. Suppose we are given a sequence $(a_n : n \ge 0)$ and are asked to find the sums $s_n = \sum_{k=0}^{n} a_k$ for each $n \ge 0$. The product formula for generating functions (Theorem 5.14(e)) allows us to find the generating function $S(z)$ for the sequence of sums $(s_n : n \ge 0)$, as follows. Define $F(z) = \sum_{n=0}^{\infty} a_n z^n$ and $G(z) = \sum_{n=0}^{\infty} 1 z^n = (1-z)^{-1}$, so $G(z) = \sum_{n=0}^{\infty} b_n z^n$ where $b_n = 1$ for all $n$. Now, the rule for multiplying power series gives

\[ S(z) = F(z)G(z) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) z^n = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k \right) z^n = \sum_{n=0}^{\infty} s_n z^n. \]

Thus, $\sum_{k=0}^{n} a_k$ is the coefficient of $z^n$ in the power series $S(z) = F(z)/(1-z)$. We can use partial fraction decompositions and similar techniques to find these coefficients, as seen in the following examples.


5.19. Example. Let us use generating functions to evaluate the finite geometric series $\sum_{k=0}^{n} 3^k$. Here $a_n = 3^n$ and $F(z) = \sum_{n=0}^{\infty} 3^n z^n = (1-3z)^{-1}$, so the generating function for the sums is $S(z) = (1-3z)^{-1}(1-z)^{-1}$. The partial fraction decomposition of $S(z)$ is

\[ S(z) = \frac{1}{(1-3z)(1-z)} = \frac{3/2}{1-3z} + \frac{-1/2}{1-z}. \]

Thus the coefficient of $z^n$ in $S(z)$ is

\[ \sum_{k=0}^{n} 3^k = (3/2)3^n - (1/2)1^n = \frac{3^{n+1} - 1}{2}. \]


Of course, this agrees with the formula for a finite geometric series derived in §2.2. 


5.20. Example. Next we compute $s_n = \sum_{k=0}^{n} k^3$ using generating functions. Here $a_n = n^3$ and $F(z) = \sum_{n=0}^{\infty} n^3 z^n$. We can evaluate this series using Theorem 5.14(g). Starting with $(1-z)^{-1} = \sum_{n=0}^{\infty} z^n$, we take a derivative and multiply by $z$ three times in a row. We get:

\[ z(1-z)^{-2} = \sum_{n=0}^{\infty} n z^n; \]
\[ z(1-z)^{-2} + 2z^2(1-z)^{-3} = \sum_{n=0}^{\infty} n^2 z^n; \]
\[ z(1-z)^{-2} + 6z^2(1-z)^{-3} + 6z^3(1-z)^{-4} = \sum_{n=0}^{\infty} n^3 z^n. \]

Multiplying by $(1-z)^{-1}$ gives the generating function for $s_n$:

\[ z(1-z)^{-3} + 6z^2(1-z)^{-4} + 6z^3(1-z)^{-5} = \sum_{n=0}^{\infty} s_n z^n. \]

Using the Negative Binomial Theorem and the Power-Shifting Rule, we obtain

\[ s_n = \binom{n-1+2}{2} + 6\binom{n-2+3}{3} + 6\binom{n-3+4}{4} = \frac{n^4 + 2n^3 + n^2}{4} = \left[ \frac{n(n+1)}{2} \right]^2. \]


Compare this derivation to the method used in §2.6. 
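The key step in Examples 5.19 and 5.20, multiplying by $(1-z)^{-1}$ to obtain partial sums, can be imitated on truncated coefficient lists. The following is a small sketch with illustrative names; it treats each series as a finite list.

```python
from math import comb

def cauchy_product(a, b):
    # coefficient n of the product is sum_{k=0}^{n} a_k * b_{n-k}
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

def partial_sums(a):
    # multiply by (1 - z)^(-1), whose coefficients are all 1
    return cauchy_product(a, [1] * len(a))

def cube_sum_from_series(n):
    # C(n+1, 2) + 6 C(n+1, 3) + 6 C(n+1, 4), read off from the series in Example 5.20;
    # math.comb returns 0 when the lower index exceeds the upper one
    return comb(n + 1, 2) + 6 * comb(n + 1, 3) + 6 * comb(n + 1, 4)
```

Applying `partial_sums` to the powers of 3 or to the cubes gives sequences that can be compared with the closed forms $(3^{n+1} - 1)/2$ and $[n(n+1)/2]^2$.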
5.21. Example. Next we find $s_n = \sum_{k=0}^{n} k^2 2^k$. In an earlier example (see (5.5)), we found

\[ F(z) = \sum_{n=0}^{\infty} n^2 2^n z^n = 2z(1-2z)^{-2} + 8z^2(1-2z)^{-3}. \]

Therefore,

\[ F(z)(1-z)^{-1} = \sum_{n=0}^{\infty} s_n z^n = 2z(1-2z)^{-2}(1-z)^{-1} + 8z^2(1-2z)^{-3}(1-z)^{-1}. \]

The partial fraction expansion (found with a computer algebra system) is

\[ F(z)(1-z)^{-1} = \frac{-6}{1-z} + \frac{12}{1-2z} + \frac{-10}{(1-2z)^2} + \frac{4}{(1-2z)^3}. \]

Extracting the coefficient of $z^n$, we get

\[ s_n = \sum_{k=0}^{n} k^2 2^k = -6 + 12 \cdot 2^n - 10\binom{n+1}{1} 2^n + 4\binom{n+2}{2} 2^n = (n^2 - 2n + 3)2^{n+1} - 6. \]
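Since the partial fraction expansion in Example 5.21 comes from a computer algebra system, an independent check of the extracted coefficients is worthwhile. This sketch compares the direct sum with both forms of the answer; the function names are hypothetical.

```python
from math import comb

def s_direct(n):
    # the sum 0^2 2^0 + 1^2 2^1 + ... + n^2 2^n, computed term by term
    return sum(k * k * 2 ** k for k in range(n + 1))

def s_from_partial_fractions(n):
    # coefficient of z^n in -6/(1-z) + 12/(1-2z) - 10/(1-2z)^2 + 4/(1-2z)^3
    return (-6 + 12 * 2 ** n - 10 * comb(n + 1, 1) * 2 ** n
            + 4 * comb(n + 2, 2) * 2 ** n)

def s_closed_form(n):
    return (n * n - 2 * n + 3) * 2 ** (n + 1) - 6
```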


5.7 Generating Function for Derangements 


Recall that a derangement of length $n$ is a permutation $w = w_1 w_2 \cdots w_n$ with $w_i \ne i$ for $1 \le i \le n$. Let $d_n$ be the number of derangements of length $n$. This section analyzes the exponential generating function $D(z) = \sum_{n=0}^{\infty} \frac{d_n}{n!} z^n$ for derangements.

In §4.5, we saw that the derangement numbers satisfy the recursion

\[ d_n = n d_{n-1} + (-1)^n \quad \text{for all } n \ge 1, \]

with initial condition $d_0 = 1$. Let us see what happens when we try to solve this recursion by the generating function method.

We start, as before, by subtracting off the initial coefficient of $D(z)$ and then applying the recursion:

\[ D(z) - 1 = \sum_{n=1}^{\infty} \frac{d_n}{n!} z^n = \sum_{n=1}^{\infty} \frac{n d_{n-1} + (-1)^n}{n!} z^n = z \sum_{n=1}^{\infty} \frac{d_{n-1}}{(n-1)!} z^{n-1} + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} z^n. \]

Letting $m = n - 1$, we see that the first sum here is $D(z)$. The second sum is the power series for $e^{-z}$ with the $n = 0$ term deleted. Therefore, $D(z) - 1 = zD(z) + e^{-z} - 1$. So $(1-z)D(z) = e^{-z}$, and we conclude that

\[ D(z) = \frac{e^{-z}}{1-z}. \]

Let $F(z) = e^{-z} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} z^n$. Recalling the technique of §5.6, we see that the power series $D(z) = F(z)/(1-z)$ is the ordinary generating function for the sequence of partial sums of the terms $a_n = (-1)^n/n!$. In other words,

\[ \frac{d_n}{n!} = \sum_{k=0}^{n} \frac{(-1)^k}{k!}, \quad \text{so} \quad d_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}. \]


Thus, generating functions have led us to the Summation Formula for Derangements in §4.5. 
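For small $n$, the recursion, the summation formula, and a direct enumeration of permutations can all be compared. The sketch below is illustrative only; the brute-force count grows like $n!$ and is meant for tiny inputs.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def derangements_by_recursion(terms):
    # d_0 = 1 and d_n = n d_{n-1} + (-1)^n
    d = [1]
    for n in range(1, terms):
        d.append(n * d[n - 1] + (-1) ** n)
    return d

def derangements_by_sum(n):
    # d_n = n! * sum_{k=0}^{n} (-1)^k / k!, evaluated exactly
    total = sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))
    return int(factorial(n) * total)

def derangements_brute_force(n):
    # count permutations of {0, ..., n-1} with no fixed point
    return sum(1 for w in permutations(range(n))
               if all(w[i] != i for i in range(n)))
```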
5.8 Counting Rules for Weighted Sets


Now that we have some facility with algebraic manipulations of generating functions, it is time to reveal the combinatorial significance of the basic operations on generating functions. In Chapter 1, we studied three fundamental counting rules: the Sum Rule, the Product Rule, and the Bijection Rule. We now state versions of these rules that can be used to find ordinary generating functions for weighted sets. Recall that a weighted set consists of a set $S$ and a weight function $\mathrm{wt}_S : S \to \mathbb{Z}_{\ge 0}$ such that $A_n = \{u \in S : \mathrm{wt}_S(u) = n\}$ is finite for all $n \ge 0$. The ordinary generating function (OGF) of this weighted set is $\mathrm{GF}(S;z) = \sum_{u \in S} z^{\mathrm{wt}_S(u)} = \sum_{n=0}^{\infty} |A_n| z^n$.


5.22. The Sum Rule for Weighted Sets. Suppose $S$ is a weighted set that is the disjoint union of $k$ weighted sets $S_1, S_2, \ldots, S_k$. Assume $\mathrm{wt}_{S_i}(u) = \mathrm{wt}_S(u)$ whenever $1 \le i \le k$ and $u \in S_i$. Then

\[ \mathrm{GF}(S;z) = \mathrm{GF}(S_1;z) + \mathrm{GF}(S_2;z) + \cdots + \mathrm{GF}(S_k;z). \]


5.23. The Product Rule for Weighted Sets. Suppose $k$ is a fixed positive integer and $S_1, \ldots, S_k$ are weighted sets. Suppose $S$ is a weighted set such that every $u \in S$ can be constructed in exactly one way by choosing $u_1 \in S_1$, choosing $u_2 \in S_2$, and so on, and then assembling the chosen objects $u_1, \ldots, u_k$ in a prescribed manner. Assume that whenever $u$ is constructed from $u_1, u_2, \ldots, u_k$, the weight-additivity condition $\mathrm{wt}_S(u) = \mathrm{wt}_{S_1}(u_1) + \mathrm{wt}_{S_2}(u_2) + \cdots + \mathrm{wt}_{S_k}(u_k)$ holds. Then

\[ \mathrm{GF}(S;z) = \mathrm{GF}(S_1;z) \cdot \mathrm{GF}(S_2;z) \cdot \ldots \cdot \mathrm{GF}(S_k;z). \]


5.24. The Bijection Rule for Weighted Sets. Suppose $S$ and $T$ are weighted sets and $f : S \to T$ is a weight-preserving bijection, meaning that $\mathrm{wt}_T(f(u)) = \mathrm{wt}_S(u)$ for all $u \in S$. Then $\mathrm{GF}(S;z) = \mathrm{GF}(T;z)$.


We prove these rules in the rest of this section; these proofs may be omitted without 
loss of continuity. Simpler proofs of the special case of finite weighted sets are given in §8.2. 

Step 1. We prove the Sum Rule for Weighted Sets when $k = 2$. Suppose $S$ is the disjoint union of weighted sets $A$ and $B$. For each $n \ge 0$, let $A_n = \{x \in A : \mathrm{wt}_A(x) = n\}$, $a_n = |A_n|$, $B_n = \{y \in B : \mathrm{wt}_B(y) = n\}$, and $b_n = |B_n|$. By definition, $\mathrm{GF}(A;z) = \sum_{n=0}^{\infty} a_n z^n$ and $\mathrm{GF}(B;z) = \sum_{n=0}^{\infty} b_n z^n$. Now let $C_n = \{u \in S : \mathrm{wt}_S(u) = n\}$ and $c_n = |C_n|$ for $n \ge 0$. Since $S$ is the disjoint union of $A$ and $B$, and since the weight function for $S$ agrees with the weight functions for $A$ and $B$, $C_n$ is the disjoint union of $A_n$ and $B_n$. So the Sum Rule for unweighted sets tells us that $c_n = a_n + b_n$ for all $n \ge 0$. By the formula for the sum of two power series,

\[ \mathrm{GF}(S;z) = \sum_{n=0}^{\infty} c_n z^n = \sum_{n=0}^{\infty} (a_n + b_n) z^n = \sum_{n=0}^{\infty} a_n z^n + \sum_{n=0}^{\infty} b_n z^n = \mathrm{GF}(A;z) + \mathrm{GF}(B;z). \]

The general case of the Sum Rule follows from Step 1 by induction on $k$.

Step 2. We prove the Product Rule for Weighted Sets when $k = 2$. Suppose every object $u$ in the weighted set $S$ can be constructed uniquely by choosing $x \in A$, choosing $y \in B$, and assembling $x$ and $y$ in some manner to produce $u$. Also assume $\mathrm{wt}_S(u) = \mathrm{wt}_A(x) + \mathrm{wt}_B(y)$ when $u$ is constructed from $x$ and $y$. Define $A_n$, $B_n$, $C_n$, $a_n$, $b_n$, and $c_n$ as in Step 1. Fix $n \in \mathbb{Z}_{\ge 0}$; we use the Sum Rule and Product Rule for unweighted sets to build objects in $C_n$. The key observation is that every $u \in C_n$ must be constructed from a pair $(x,y) \in A \times B$ where $\mathrm{wt}_A(x) = k$ for some $k \in \{0, 1, \ldots, n\}$ and $\mathrm{wt}_B(y) = \mathrm{wt}_S(u) - \mathrm{wt}_A(x) = n - k$. For fixed $k$ between 0 and $n$, we can choose $x$ from the set $A_k$ in $a_k$ ways, then choose $y$ from the set $B_{n-k}$ in $b_{n-k}$ ways. The Product Rule gives $a_k b_{n-k}$ pairs $(x,y)$ for this value of $k$. The Sum Rule now shows that $c_n = |C_n| = \sum_{k=0}^{n} a_k b_{n-k}$. By the formula for the product of two power series,

\[ \mathrm{GF}(S;z) = \sum_{n=0}^{\infty} c_n z^n = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) z^n = \left( \sum_{n=0}^{\infty} a_n z^n \right) \cdot \left( \sum_{n=0}^{\infty} b_n z^n \right) = \mathrm{GF}(A;z) \cdot \mathrm{GF}(B;z). \]

The general case of the Product Rule follows from Step 2 by induction on $k$.

Step 3. We prove the Bijection Rule for Weighted Sets. Let $f : S \to T$ be a weight-preserving bijection with two-sided inverse $g : T \to S$. One checks that $g$ is also a weight-preserving bijection. For each $n \ge 0$, $f$ and $g$ restrict to bijections $f_n : S_n \to T_n$ and $g_n : T_n \to S_n$, where $S_n = \{u \in S : \mathrm{wt}_S(u) = n\}$ and $T_n = \{v \in T : \mathrm{wt}_T(v) = n\}$. By the Bijection Rule for unweighted sets, $|S_n| = |T_n|$ for all $n \ge 0$. Now, by the criterion for equality of power series,

\[ \mathrm{GF}(S;z) = \sum_{n=0}^{\infty} |S_n| z^n = \sum_{n=0}^{\infty} |T_n| z^n = \mathrm{GF}(T;z). \]


5.9 Examples of the Product Rule for Weighted Sets 


This section gives examples of counting problems that can be solved using the Product Rule 
for Weighted Sets. 


5.25. Example. Let us count nonnegative integer solutions to the equation $x_1 + x_2 + x_3 + x_4 = n$ subject to the restrictions $0 \le x_1 \le 3$, $x_3 \ge 2$, and $x_4$ is even. Define $S$ to be the set of all 4-tuples $(x_1, x_2, x_3, x_4)$ satisfying the given restrictions, and define $\mathrm{wt}(x_1, x_2, x_3, x_4) = x_1 + x_2 + x_3 + x_4$. We seek the coefficient of $z^n$ in the generating function $\mathrm{GF}(S;z)$.

To find this generating function, note that each object in $S$ arises by choosing $x_1$ from the set $S_1 = \{0, 1, 2, 3\}$, choosing $x_2$ from the set $S_2 = \{0, 1, 2, \ldots\}$, choosing $x_3$ from the set $S_3 = \{2, 3, 4, \ldots\}$, and choosing $x_4$ from the set $S_4 = \{0, 2, 4, 6, \ldots\}$. Define $\mathrm{wt}(a) = a$ for each integer $a$ in any of the sets $S_i$. Then the weight-additivity condition

\[ \mathrm{wt}_S(x_1, x_2, x_3, x_4) = \mathrm{wt}_{S_1}(x_1) + \mathrm{wt}_{S_2}(x_2) + \mathrm{wt}_{S_3}(x_3) + \mathrm{wt}_{S_4}(x_4) \]

is satisfied. By the Product Rule for Weighted Sets, $\mathrm{GF}(S;z) = \prod_{i=1}^{4} \mathrm{GF}(S_i;z)$. To find the generating function for $S_i$, we use the definition $\mathrm{GF}(S_i;z) = \sum_{a \in S_i} z^{\mathrm{wt}(a)}$. Making repeated use of the Geometric Series Formula, we compute:

\[ \mathrm{GF}(S_1;z) = z^0 + z^1 + z^2 + z^3; \]
\[ \mathrm{GF}(S_2;z) = z^0 + z^1 + z^2 + \cdots = (1-z)^{-1}; \]
\[ \mathrm{GF}(S_3;z) = z^2 + z^3 + z^4 + \cdots = z^2 \sum_{k=0}^{\infty} z^k = z^2(1-z)^{-1}; \]
\[ \mathrm{GF}(S_4;z) = 1 + z^2 + z^4 + z^6 + \cdots = \sum_{k=0}^{\infty} z^{2k} = (1-z^2)^{-1} = (1-z)^{-1}(1+z)^{-1}. \]

Therefore,

\[ \mathrm{GF}(S;z) = (1 + z + z^2 + z^3)\, z^2 (1-z)^{-3} (1+z)^{-1} = (z^2 + z^4)(1-z)^{-3}. \]

Using a computer algebra system, we find the partial fraction decomposition of this generating function to be

\[ \mathrm{GF}(S;z) = -3 - z + 2(1-z)^{-3} - 6(1-z)^{-2} + 7(1-z)^{-1}. \]

Extracting the coefficient of $z^n$, the number of solutions for $n \ge 2$ is

\[ 2\binom{n+2}{2} - 6\binom{n+1}{1} + 7 = n^2 - 3n + 3. \]

When $n = 0$ or $n = 1$, there are no solutions since we required $x_3 \ge 2$.
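Example 5.25 can be verified by brute force and by multiplying truncated generating functions, which mirrors the Product Rule for Weighted Sets. The following sketch uses hypothetical helper names and indicator lists for the four sets.

```python
def count_solutions(n):
    """Count (x1, x2, x3, x4) with sum n, 0 <= x1 <= 3, x3 >= 2, x4 even."""
    total = 0
    for x1 in range(0, 4):
        for x3 in range(2, n + 1):
            for x4 in range(0, n + 1, 2):
                x2 = n - x1 - x3 - x4   # x2 is forced once the others are chosen
                if x2 >= 0:
                    total += 1
    return total

def gf_by_product_rule(terms):
    """Truncated coefficients of GF(S1) * GF(S2) * GF(S3) * GF(S4)."""
    s1 = [1 if a <= 3 else 0 for a in range(terms)]
    s2 = [1] * terms
    s3 = [1 if a >= 2 else 0 for a in range(terms)]
    s4 = [1 if a % 2 == 0 else 0 for a in range(terms)]
    result = [1] + [0] * (terms - 1)
    for factor in (s1, s2, s3, s4):
        result = [sum(result[k] * factor[n - k] for k in range(n + 1))
                  for n in range(terms)]
    return result
```

For $n \ge 2$ both counts should equal $n^2 - 3n + 3$, and both should be 0 for $n = 0$ and $n = 1$.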


5.26. Example. Suppose there are $k$ varieties of donuts. How many bags of $n$ donuts contain an odd number of donuts of each variety? We can model this counting problem using the integer equation $x_1 + x_2 + \cdots + x_k = n$, where each $x_i$ must be an odd positive integer. Define $S$ to be the set of $k$-tuples $(x_1, \ldots, x_k)$ where every $x_i$ is odd and positive, and let $\mathrm{wt}(x_1, \ldots, x_k) = x_1 + \cdots + x_k$. Let $T = \{1, 3, 5, \ldots\}$ be the set of odd positive integers, with $\mathrm{wt}(a) = a$ for $a \in T$. We can build each object in $S$ by choosing $x_1$ from $T$, then $x_2$ from $T$, and so on. By the Product Rule for Weighted Sets, $\mathrm{GF}(S;z) = \prod_{i=1}^{k} \mathrm{GF}(T;z) = \mathrm{GF}(T;z)^k$. By the Geometric Series Formula,

\[ \mathrm{GF}(T;z) = z^1 + z^3 + z^5 + \cdots = z \sum_{j=0}^{\infty} z^{2j} = z(1-z^2)^{-1}. \]

Therefore, $\mathrm{GF}(S;z) = z^k(1-z^2)^{-k}$. By the Negative Binomial Theorem (with $z$ replaced by $z^2$),

\[ z^k(1-z^2)^{-k} = z^k \sum_{j=0}^{\infty} \binom{j+k-1}{j,\, k-1} z^{2j}. \]

The total power of $z$ is $n$ when $k + 2j = n$, or equivalently $j = (n-k)/2$. So the answer to the counting question is $\binom{(n-k)/2 + k - 1}{k-1}$ when $n - k$ is even and nonnegative, and zero otherwise.

On the other hand, we can factor the generating function for $S$, obtaining $\mathrm{GF}(S;z) = z^k(1-z)^{-k}(1+z)^{-k}$. Using the Negative Binomial Theorem twice and the definition of the product of generating functions,

\[ \mathrm{GF}(S;z) = z^k \left[ \sum_{i=0}^{\infty} \binom{i+k-1}{k-1} z^i \right] \cdot \left[ \sum_{j=0}^{\infty} (-1)^j \binom{j+k-1}{k-1} z^j \right] = \sum_{m=0}^{\infty} \left( \sum_{j=0}^{m} (-1)^j \binom{j+k-1}{k-1} \binom{m-j+k-1}{k-1} \right) z^{m+k}. \]

Taking $m = n - k$ gives the power $z^n$. Extracting this coefficient and comparing to our earlier answer, we deduce the combinatorial identity

\[ \sum_{j=0}^{n-k} (-1)^j \binom{j+k-1}{k-1} \binom{n-j-1}{k-1} = \begin{cases} \binom{(n-k)/2 + k - 1}{k-1} & \text{if } (n-k)/2 \in \mathbb{Z}_{\ge 0}; \\ 0 & \text{otherwise.} \end{cases} \]
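The identity at the end of Example 5.26 can be tested numerically, along with a brute-force count of the donut bags. The sketch below writes the alternating sum with lower indices $k-1$, which is one reading of the identity; the function names are illustrative.

```python
from itertools import product
from math import comb

def alternating_sum(n, k):
    # sum_{j=0}^{n-k} (-1)^j C(j+k-1, k-1) C(n-j-1, k-1); empty (zero) when n < k
    return sum((-1) ** j * comb(j + k - 1, k - 1) * comb(n - j - 1, k - 1)
               for j in range(n - k + 1))

def direct_count(n, k):
    # C((n-k)/2 + k - 1, k - 1) when n - k is even and nonnegative, else 0
    if n >= k and (n - k) % 2 == 0:
        return comb((n - k) // 2 + k - 1, k - 1)
    return 0

def brute_force(n, k):
    # k-tuples of odd positive integers with sum n (small inputs only)
    return sum(1 for xs in product(range(1, n + 1, 2), repeat=k) if sum(xs) == n)
```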
5.27. Example. How many ways can we make a multiset of 35 marbles using red, white, blue, green, and yellow marbles, if each color of marble must be used between 4 and 9 times? We build such a multiset by choosing the number of red marbles, the number of white marbles, and so on. The generating function for each individual choice is

\[ F(z) = z^4 + z^5 + z^6 + z^7 + z^8 + z^9 = z^4 \sum_{k=0}^{5} z^k = \frac{z^4(1-z^6)}{1-z}. \]

By the Product Rule for Weighted Sets, the generating function for all multisets of marbles satisfying the color restrictions (weighted by the size of the multiset) is

\[ F(z)^5 = z^{20}(1-z^6)^5(1-z)^{-5}. \]

We need the coefficient of $z^{35}$ here, which equals the coefficient of $z^{15}$ in $(1-z^6)^5(1-z)^{-5}$. Using the Binomial Theorem to expand $(1-z^6)^5$ and the Negative Binomial Theorem to expand $(1-z)^{-5}$, we get

\[ (1 - 5z^6 + 10z^{12} - 10z^{18} + 5z^{24} - z^{30}) \cdot \left[ 1 + \binom{5}{4} z + \binom{6}{4} z^2 + \cdots + \binom{n+4}{4} z^n + \cdots \right]. \]

When we multiply this out, the terms involving $z^0$, $z^6$, and $z^{12}$ in the first factor can be matched with the terms involving $z^{15}$, $z^9$, and $z^3$ (respectively) in the second factor to get a total power of $z^{15}$. No other pair of terms can be multiplied to give this power of $z$. Thus, the coefficient we need is

\[ \binom{19}{4} - 5\binom{13}{4} + 10\binom{7}{4} = 651. \]
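The count in Example 5.27 is small enough to confirm by enumerating all $6^5$ ways to pick a count between 4 and 9 for each color. This is a brute-force sketch with illustrative names and default arguments taken from the example.

```python
from itertools import product
from math import comb

def brute_force_marbles(total=35, colors=5, low=4, high=9):
    # one count per color, each between low and high, summing to total
    return sum(1 for counts in product(range(low, high + 1), repeat=colors)
               if sum(counts) == total)

def coefficient_formula():
    # C(19,4) - 5 C(13,4) + 10 C(7,4), as extracted from the generating function
    return comb(19, 4) - 5 * comb(13, 4) + 10 * comb(7, 4)
```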


5.28. Example. Define the weight of a multiset to be the number of objects in the multiset (counting repetitions). Let us find the generating function for multisets using the alphabet $\{1, 2, 3\}$ where the number of 1's is unrestricted, the number of 2's is a multiple of 4, and the number of 3's cannot be a multiple of 3. We can build such a multiset by choosing how many 1's it contains, how many 2's it contains, and how many 3's it contains. The generating function for the first choice is $1 + z + z^2 + \cdots = (1-z)^{-1}$. The generating function for the second choice is $1 + z^4 + z^8 + \cdots = (1-z^4)^{-1}$. The generating function for the third choice is

\[ z^1 + z^2 + z^4 + z^5 + z^7 + z^8 + \cdots = \sum_{n=0}^{\infty} z^n - \sum_{n=0}^{\infty} z^{3n} = (1-z)^{-1} - (1-z^3)^{-1}. \]

By the Product Rule for Weighted Sets, the required generating function is

\[ (1-z)^{-1}(1-z^4)^{-1}\left[ (1-z)^{-1} - (1-z^3)^{-1} \right] = \frac{z}{(1-z)^3(1+z^2)(1+z+z^2)}. \]

If we redefine the weight of a multiset to be the sum of all its members, the generating function would be

\[ (1-z)^{-1}(1-z^8)^{-1}\left[ (1-z^3)^{-1} - (1-z^9)^{-1} \right] = \frac{z^3 + z^6}{(1-z)(1-z^8)(1-z^9)}. \]
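Both generating functions in Example 5.28 can be expanded by long division and compared with a direct count of the multisets. The sketch below assumes integer polynomials whose denominators have constant term 1; all function names are hypothetical.

```python
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def rational_series(num, den, terms):
    """First `terms` coefficients of num(z)/den(z), assuming den[0] == 1."""
    coeffs = []
    for n in range(terms):
        value = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            value -= den[k] * coeffs[n - k]
        coeffs.append(value)
    return coeffs

def count_by_size(n):
    # multisets with n objects: 2's a multiple of 4, 3's not a multiple of 3, 1's fill the rest
    return sum(1 for twos in range(0, n + 1, 4)
               for threes in range(n - twos + 1) if threes % 3 != 0)

def count_by_sum(n):
    # multisets whose members sum to n, under the same restrictions
    return sum(1 for twos in range(0, n // 2 + 1, 4)
               for threes in range((n - 2 * twos) // 3 + 1) if threes % 3 != 0)
```

For the first weight, the denominator is $(1-z)^3(1+z^2)(1+z+z^2)$ and the numerator is $z$; for the second, the denominator is $(1-z)(1-z^8)(1-z^9)$ and the numerator is $z^3 + z^6$.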


5.10 Generating Functions for Trees 


This section illustrates the Sum Rule and Product Rule for Weighted Sets by deriving the 
generating functions for various kinds of trees. 
5.29. Example: Binary Trees. Let $S$ be the set of all binary trees, weighted by the number of vertices. By the definition of binary trees in Example 2.31, every tree $t \in S$ is either empty or is an ordered triple $(\bullet, t_1, t_2)$, where $t_1$ and $t_2$ are binary trees. Let $S_0$ be the one-element set consisting of the empty binary tree, let $S^+ = S - S_0$ be the set of nonempty binary trees, and let $N = \{\bullet\}$ be a one-element set such that $\mathrm{wt}(\bullet) = 1$. By definition of the generating function for a weighted set, we have $\mathrm{GF}(S_0;z) = z^0 = 1$ and $\mathrm{GF}(N;z) = z^1 = z$. By the Sum Rule for Weighted Sets, we conclude that

\[ \mathrm{GF}(S;z) = \mathrm{GF}(S_0;z) + \mathrm{GF}(S^+;z) = 1 + \mathrm{GF}(S^+;z). \]

By the recursive definition of nonempty binary trees, we can uniquely construct every tree $t \in S^+$ by (i) choosing the root node $\bullet$ from the set $N$; (ii) choosing the left subtree $t_1$ from the set $S$; (iii) choosing the right subtree $t_2$ from the set $S$; and then assembling these choices to form the tree $t = (\bullet, t_1, t_2)$. Observe that the weight-additivity condition $\mathrm{wt}_{S^+}(t) = \mathrm{wt}_N(\bullet) + \mathrm{wt}_S(t_1) + \mathrm{wt}_S(t_2)$ holds, since the weight of each tree is the number of vertices in the tree. It follows from the Product Rule for Weighted Sets that

\[ \mathrm{GF}(S^+;z) = \mathrm{GF}(N;z)\, \mathrm{GF}(S;z)\, \mathrm{GF}(S;z) = z\, \mathrm{GF}(S;z)^2. \]

Letting $F(z)$ denote the unknown generating function $\mathrm{GF}(S;z)$, we conclude that $F$ satisfies the equation $F(z) = 1 + zF(z)^2$, or equivalently $zF^2 - F + 1 = 0$. Furthermore, $F(0) = 1$ since there is exactly one binary tree with zero vertices.

We can formally solve this equation for $F$ using the Quadratic Formula: $ax^2 + bx + c = 0$ has roots $(-b \pm \sqrt{b^2 - 4ac})/2a$. Taking $x = F$, $a = z$, $b = -1$, and $c = 1$ leads to

\[ F(z) = \frac{1 \pm \sqrt{1-4z}}{2z}. \]

We use the Extended Binomial Theorem to get the power series expansion of $\sqrt{1-4z}$:

\[ (1-4z)^{1/2} = \sum_{n=0}^{\infty} \frac{(1/2){\downarrow}_n}{n!} (-4z)^n. \]

Expanding the definition of the falling factorial, the coefficient of $z^n$ in $(1-4z)^{1/2}$ for $n \ge 1$ is

\[ \frac{(1/2)(-1/2)(-3/2)\cdots(-(2n-3)/2)\,(-1)^n 2^{2n}}{n!} = -\frac{1 \cdot 1 \cdot 3 \cdot 5 \cdots (2n-3) \cdot 2^n}{n!}. \]

To continue simplifying, multiply the numerator and denominator by $n!$, and notice that $2^n n! = 2 \cdot 4 \cdot 6 \cdot \ldots \cdot 2n$. The new numerator is the product of all integers from 1 to $2n$ except $2n-1$. Multiplying numerator and denominator by $2n-1$, we get

\[ (1-4z)^{1/2} = \sum_{n=0}^{\infty} \frac{-1}{2n-1} \cdot \frac{(2n)!}{n!\, n!} z^n = 1 - 2z - 2z^2 - 4z^3 - 10z^4 - 28z^5 - \cdots. \]

Returning to the expression $F(z) = (1 \pm \sqrt{1-4z})/(2z)$, do we choose the plus or minus sign? If we choose the plus sign, the resulting expression contains a term $z^{-1}$, which becomes unbounded as $z$ approaches zero. If we choose the minus sign, this term goes away, and moreover the constant coefficient is $+1$ as needed. So we use the minus sign. Deleting the constant term of $(1-4z)^{1/2}$, negating, and dividing by $2z$, we find that

\[ F(z) = \sum_{n=1}^{\infty} \frac{(2n)!}{2(2n-1)\, n!\, n!} z^{n-1} = \sum_{m=0}^{\infty} \frac{(2m+2)!}{2(2m+1)(m+1)!\,(m+1)!} z^m \]
\[ = \sum_{m=0}^{\infty} \frac{(2m+2)(2m+1)!}{(2m+1)(2m+2)\, m!\, (m+1)!} z^m = \sum_{m=0}^{\infty} \frac{1}{2m+1} \binom{2m+1}{m,\, m+1} z^m. \]

Taking the coefficient of $z^m$ gives the number of binary trees with $m$ vertices, which is the Catalan number $C_m$. A more combinatorial approach to this result was given in §2.10.
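The functional equation $F = 1 + zF^2$ determines the coefficients of $F$ recursively, which gives an independent check on the closed formula. The following sketch uses illustrative names and compares three computations.

```python
from math import comb

def binary_tree_counts(terms):
    # F = 1 + z F^2 means f_0 = 1 and f_n = sum_{k=0}^{n-1} f_k f_{n-1-k} for n >= 1
    f = [1]
    for n in range(1, terms):
        f.append(sum(f[k] * f[n - 1 - k] for k in range(n)))
    return f

def coefficient_from_example(m):
    # (1/(2m+1)) * C(2m+1, m), the coefficient of z^m derived in Example 5.29
    return comb(2 * m + 1, m) // (2 * m + 1)

def sqrt_series_coeff(n):
    # coefficient of z^n in (1-4z)^(1/2): -(2n)! / ((2n-1) n! n!); the division is exact
    return -comb(2 * n, n) // (2 * n - 1)
```

The first few values of `sqrt_series_coeff` are 1, -2, -2, -4, -10, -28, in agreement with the displayed expansion.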


In the previous example, we can retroactively justify the formal application of the Quadratic Formula by confirming that $F(z) = (1 - (1-4z)^{1/2})/(2z)$ really does satisfy the functional equation $F(z) = 1 + zF(z)^2$ and initial condition $F(0) = 1$. In particular, the power series we found for $F$ converges for all $z \in \mathbb{C}$ with $|z| < 1/4$. Alternatively, all of these algebraic computations can be justified at the level of formal power series, without worrying about convergence (see Chapter 11).


5.30. Example: Full Binary Trees. A binary tree is called full iff for every vertex in the tree, the left subtree and right subtree for that vertex are either both empty or both nonempty. A vertex whose left and right subtrees are both empty is called a leaf of the tree. Let $S$ be the set of nonempty full binary trees, weighted by the number of leaves. $S$ is the disjoint union of $S_1 = \{(\bullet, \emptyset, \emptyset)\}$ and $S_{\ge 2} = S - S_1$. We can build a tree $t$ in $S_{\ge 2}$ by choosing any $t_1$ in $S$ as the nonempty left subtree of the root, and then choosing any $t_2$ in $S$ as the nonempty right subtree of the root. Note that $\mathrm{wt}(t) = \mathrm{wt}(t_1) + \mathrm{wt}(t_2)$ since the weight is the number of leaves. So, by the Product Rule for Weighted Sets, $\mathrm{GF}(S_{\ge 2};z) = \mathrm{GF}(S;z)^2$. We see directly that $\mathrm{GF}(S_1;z) = z$. Letting $G(z) = \mathrm{GF}(S;z)$, the Sum Rule for Weighted Sets now gives the relation $G(z) = z + G(z)^2$. Moreover, $G(0) = 0$ since there are no nonempty full binary trees with zero leaves. Solving the quadratic equation $G^2 - G + z = 0$ by calculations analogous to those in Example 5.29, we find that

\[ G(z) = \frac{1 - \sqrt{1-4z}}{2} = zF(z), \]

where $F(z)$ is the generating function considered in the previous example. So for all $n \ge 1$, the coefficient of $z^n$ in $G$ equals the coefficient of $z^{n-1}$ in $F$, namely the Catalan number $C_{n-1}$. We conclude that there are $C_{n-1}$ full binary trees with $n$ leaves for each $n \ge 1$.


The next example illustrates a generalization of the Sum Rule for Weighted Sets, in which 
a weighted set $S$ is written as a disjoint union of infinitely many subsets $S_k$. By analogy with
the finite version of the Sum Rule, we are led to the formula $\mathrm{GF}(S; z) = \sum_{k=0}^{\infty} \mathrm{GF}(S_k; z)$,
but what is the meaning of the infinite sum of power series appearing on the right side? 
For now, we work with this infinite sum informally and see where the calculation leads us. 
A rigorous discussion of infinite sums and products of formal power series is given later 
(see §11.1). 


5.31. Example: Ordered Trees. We recursively define ordered trees as follows. First, the 
list $t = (0)$ is an ordered tree with $\mathrm{wt}(t) = 1$. Second, for every integer $k \geq 1$ and every list
$t_1, t_2, \ldots, t_k$ of $k$ ordered trees, the $(k+1)$-tuple $t = (k, t_1, t_2, \ldots, t_k)$ is an ordered tree with
$\mathrm{wt}(t) = 1 + \mathrm{wt}(t_1) + \cdots + \mathrm{wt}(t_k)$. All ordered trees arise by applying these two rules a finite
number of times. The first rule can be considered a degenerate version of the second rule in
which $k = 0$. Informally, the list $(k, t_1, \ldots, t_k)$ represents a tree whose root has $k$ subtrees
$t_i$, which appear in order from left to right, and where each subtree is also an ordered tree.
The weight of a tree is the number of vertices in the tree. 

Let S be the weighted set of all ordered trees. We now find the generating function 
$G(z) = \mathrm{GF}(S; z)$. First, we write $S$ as the disjoint union of sets $S_k$ for $k \in \mathbb{Z}_{\geq 0}$, where $S_k$
consists of all trees $t \in S$ such that the root node has $k$ subtrees. By the generalized Sum
Rule for Weighted Sets,


\[ G(z) = \mathrm{GF}(S; z) = \sum_{k=0}^{\infty} \mathrm{GF}(S_k; z). \]


For fixed $k \geq 0$, we can build each tree $t = (k, t_1, t_2, \ldots, t_k) \in S_k$ by choosing each entry
in the sequence. Since the root vertex contributes 1 to the weight of $t$, the Product Rule
for Weighted Sets shows that $\mathrm{GF}(S_k; z) = zG(z)^k$. Substitution into the previous formula
gives the functional equation

\[ G(z) = \sum_{k=0}^{\infty} zG(z)^k = \frac{z}{1 - G(z)}; \]


the last step is a formal version of the Geometric Series Formula (see §11.3 for a rigorous
justification). This equation simplifies to the quadratic $G^2 - G + z = 0$, which is the same
equation that occurred in the previous example. So we conclude, as before, that


\[ G(z) = \frac{1 - \sqrt{1-4z}}{2} = \sum_{n=1}^{\infty} C_{n-1} z^n, \]


where $C_{n-1}$ is a Catalan number. Thus, there are $C_{n-1}$ ordered trees with $n$ vertices for all
$n \geq 1$.


5.11 Tree Bijections 


The generating function calculations in the previous section prove the following (perhaps 
unexpected) enumeration result: the set of binary trees with n vertices, the set of full binary 
trees with $n+1$ leaves, and the set of ordered trees with $n+1$ vertices all have cardinality
$C_n$. Although this is a purely combinatorial statement, the generating function method
arrives at this result in a very mysterious and indirect fashion, which requires algebraic
manipulations of quadratic expressions and the power series expansion of $\sqrt{1-4z}$.

Having found this result with the aid of generating functions, we may ask for a bijective 
proof in which the three sets of trees are linked by explicitly defined bijections. Some 
methods for building such bijections from recursions were studied in Chapter 2. Here we 
are seeking weight-preserving bijections on infinite sets, which can be defined as follows. 

Let S denote the set of all binary trees weighted by number of vertices, and let T be the 
set of all nonempty full binary trees weighted by number of leaves. We define a bijection 
$f : S \to T$ recursively by setting $f(\emptyset) = (\bullet, \emptyset, \emptyset)$ and


\[ f(\bullet, t_1, t_2) = (\bullet, f(t_1), f(t_2)). \]


See Figure 5.1. We claim that $f$ preserves weights in the sense that $\mathrm{wt}_T(f(t)) = \mathrm{wt}_S(t) + 1$
for all $t \in S$ (we add 1 on the right side since $\mathrm{GF}(T; z) = z\,\mathrm{GF}(S; z)$). We verify this claim
by induction. For the base case, note that the zero-vertex tree $\emptyset$ is mapped to the one-leaf
tree $(\bullet, \emptyset, \emptyset)$. For the induction step, suppose $t_1$ and $t_2$ are trees in $S$ with $a$ vertices and
$b$ vertices, respectively. By induction, $f(t_1)$ and $f(t_2)$ are nonempty full binary trees with
$a+1$ leaves and $b+1$ leaves, respectively. It follows that $f$ sends the tree $t = (\bullet, t_1, t_2)$
with $a+b+1$ vertices to a full binary tree with $(a+1)+(b+1) = (a+b+1)+1$ leaves,
as needed. The inverse of $f$ has an especially simple pictorial description: just erase all the
leaves! Since $f^{-1}$ respects weights, we see that a nonempty full binary tree always has one
more leaf vertex than internal (non-leaf) vertex. 

Next, let U be the set of all ordered trees, weighted by number of vertices. We define 
a weight-preserving bijection $g : T \to U$. First, define $g(\bullet, \emptyset, \emptyset) = (0)$. Second, given $t =
(\bullet, t_1, t_2) \in T$ with $t_1$ and $t_2$ nonempty, define $g(t)$ by starting with the ordered tree $g(t_1)$
and appending the ordered tree $g(t_2)$ as the new rightmost subtree of the root node of $g(t_1)$.
See Figure 5.2. More formally, if $g(t_1) = (k, u_1, \ldots, u_k)$, let $g(t) = (k+1, u_1, \ldots, u_k, g(t_2))$.
One may check by induction that the number of vertices in $g(t)$ equals the number of leaves
in $t$, as required.


FIGURE 5.1
Bijection between binary trees and full binary trees.


FIGURE 5.2
Bijection between full binary trees and ordered trees.
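
The two recursive definitions translate almost line by line into code. The sketch below is illustrative only and rests on an encoding chosen here, not taken from the text: the empty binary tree is `None`, a tree $(\bullet, t_1, t_2)$ is the pair `(t1, t2)`, and an ordered tree is the tuple $(k, t_1, \ldots, t_k)$ exactly as in Example 5.31.

```python
def f(t):
    """Binary tree -> nonempty full binary tree."""
    if t is None:
        return (None, None)          # the one-leaf tree
    left, right = t
    return (f(left), f(right))

def f_inverse(t):
    """Erase all the leaves of a nonempty full binary tree."""
    left, right = t
    if left is None and right is None:
        return None
    return (f_inverse(left), f_inverse(right))

def g(t):
    """Nonempty full binary tree -> ordered tree (k, t_1, ..., t_k)."""
    left, right = t
    if left is None and right is None:
        return (0,)
    k, *subtrees = g(left)
    # append g(right) as the new rightmost subtree of the root of g(left)
    return (k + 1, *subtrees, g(right))

def binary_trees(n):
    """List all binary trees with n vertices."""
    if n == 0:
        return [None]
    return [(l, r) for a in range(n)
            for l in binary_trees(a) for r in binary_trees(n - 1 - a)]

def leaves(t):
    """Number of leaves of a nonempty full binary tree."""
    left, right = t
    if left is None and right is None:
        return 1
    return leaves(left) + leaves(right)

def ordered_vertices(t):
    """Number of vertices of an ordered tree."""
    return 1 + sum(ordered_vertices(s) for s in t[1:])
```

Applying `g(f(t))` to every binary tree with $n$ vertices should produce each ordered tree with $n+1$ vertices exactly once.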


5.32. Remark. These examples show that generating functions are powerful algebraic tools 
for deriving enumeration results. However, once such results are found, we often seek direct 
combinatorial proofs that do not rely on generating functions. In particular, bijective proofs 
are more informative (and often more elegant) than algebraic proofs in the sense that they 
give us an explicit pairing between the objects in two sets. 


5.12 Exponential Generating Functions


This section introduces exponential generating functions (EGFs) for weighted sets. We 
then develop a version of the Product Rule that explains the combinatorial meaning of the 
product of EGFs. 


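5.33. Definition: Exponential Generating Functions. Given a set $S$ with weight
function $\mathrm{wt} : S \to \mathbb{Z}_{\geq 0}$, the exponential generating function of the weighted set $S$ is


\[ \mathrm{EGF}(S; z) = \sum_{u \in S} \frac{z^{\mathrm{wt}(u)}}{\mathrm{wt}(u)!} = \sum_{n=0}^{\infty} a_n \frac{z^n}{n!}, \]

where $a_n = |\{u \in S : \mathrm{wt}(u) = n\}|$.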


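5.34. Example. Suppose $S = \mathbb{Z}_{\geq 0}$ and $\mathrm{wt}(k) = k$ for $k \in S$. Then $a_n = 1$ for all $n \in \mathbb{Z}_{\geq 0}$,
so $\mathrm{EGF}(S; z) = \sum_{n=0}^{\infty} z^n/n! = e^z$. This is why the term “exponential generating function”
is used. In contrast, $\mathrm{GF}(S; z) = \sum_{n=0}^{\infty} z^n = (1-z)^{-1}$.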


Our goal is to find a version of the Product Rule applicable to EGFs. To prepare for 
this, we first recall the rule for multiplying two formal power series (Definition 5.15(e)). 
This rule can be written 


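\[ \left(\sum_{n=0}^{\infty} a_n z^n\right) \cdot \left(\sum_{n=0}^{\infty} b_n z^n\right) = \sum_{n=0}^{\infty} \Biggl( \sum_{\substack{(i,j) \in \mathbb{Z}_{\geq 0}^2: \\ i+j=n}} a_i b_j \Biggr) z^n. \tag{5.6} \]

More generally, consider a product of $k$ formal power series $F_1(z), \ldots, F_k(z)$. Write $F_j(z) =
\sum_{n=0}^{\infty} f_{j,n} z^n$ for $1 \leq j \leq k$. The following formula can be proved using (5.6) and induction
on $k$.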


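5.35. Theorem: Products of OGFs. For all $k \in \mathbb{Z}_{>0}$ and all formal power series
$F_1, F_2, \ldots, F_k$ with $F_j = \sum_{n=0}^{\infty} f_{j,n} z^n$,


\[ F_1(z) F_2(z) \cdots F_k(z) = \sum_{n=0}^{\infty} \Biggl( \sum_{\substack{(i_1, i_2, \ldots, i_k) \in \mathbb{Z}_{\geq 0}^k: \\ i_1 + i_2 + \cdots + i_k = n}} f_{1,i_1} f_{2,i_2} \cdots f_{k,i_k} \Biggr) z^n. \tag{5.7} \]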


Intuitively, this formula arises by expanding the product of k sums on the left side using 
the Distributive Law (see Exercise 2-16). This expansion produces a sum of various terms, 
where each term is a product of $k$ monomials $f_{1,i_1} z^{i_1}, f_{2,i_2} z^{i_2}, \ldots, f_{k,i_k} z^{i_k}$. This product
will contribute to the coefficient of $z^n$ iff $i_1 + i_2 + \cdots + i_k = n$; the contribution to that
coefficient will be $f_{1,i_1} f_{2,i_2} \cdots f_{k,i_k}$. Adding over all possible choices of $(i_1, i_2, \ldots, i_k)$ gives
the right side of (5.7). 

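Let us see how the product formula changes when we use EGFs. Now set $G_j(z) =
\sum_{n=0}^{\infty} (g_{j,n}/n!) z^n$ for $1 \leq j \leq k$. Replacing $f_{j,i}$ by $g_{j,i}/i!$ in (5.7), we get


\[ G_1(z) G_2(z) \cdots G_k(z) = \sum_{n=0}^{\infty} \Biggl( \sum_{\substack{(i_1, i_2, \ldots, i_k) \in \mathbb{Z}_{\geq 0}^k: \\ i_1 + i_2 + \cdots + i_k = n}} \frac{g_{1,i_1} g_{2,i_2} \cdots g_{k,i_k}}{i_1!\, i_2! \cdots i_k!} \Biggr) z^n. \tag{5.8} \]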
If we multiply and divide the coefficient of $z^n$ by $n!$, a multinomial coefficient appears. Thus
we can write


\[ G_1(z) G_2(z) \cdots G_k(z) = \sum_{n=0}^{\infty} \Biggl( \sum_{\substack{(i_1, i_2, \ldots, i_k) \in \mathbb{Z}_{\geq 0}^k: \\ i_1 + i_2 + \cdots + i_k = n}} \binom{n}{i_1, i_2, \ldots, i_k} g_{1,i_1} g_{2,i_2} \cdots g_{k,i_k} \Biggr) \frac{z^n}{n!}. \tag{5.9} \]

We can reverse engineer a combinatorial interpretation for the coefficient of z”/n! using the 
Anagram Rule, the Product Rule, and the Sum Rule. This leads to the following Product 
Rule for EGFs. (Compare to the version of the Product Rule for OGFs, given in 5.23.) 


5.36. The EGF Product Rule. Suppose $k$ is a fixed positive integer and $S_1, \ldots, S_k$ are
weighted sets. Suppose $S$ is a weighted set such that every $u \in S$ can be constructed in
exactly one way as follows. For $1 \leq j \leq k$, choose $u_j \in S_j$ and let $i_j = \mathrm{wt}_{S_j}(u_j)$. Then
choose an anagram $w \in R(1^{i_1} 2^{i_2} \cdots k^{i_k})$. Finally, assemble the chosen objects $u_1, \ldots, u_k, w$
in a prescribed manner. Assume that whenever $u$ is constructed from $u_1, u_2, \ldots, u_k, w$, the
weight-additivity condition $\mathrm{wt}_S(u) = i_1 + \cdots + i_k = \mathrm{wt}_{S_1}(u_1) + \mathrm{wt}_{S_2}(u_2) + \cdots + \mathrm{wt}_{S_k}(u_k)$
holds. Then
\[ \mathrm{EGF}(S; z) = \mathrm{EGF}(S_1; z) \cdot \mathrm{EGF}(S_2; z) \cdot \ldots \cdot \mathrm{EGF}(S_k; z). \]


The following examples illustrate the use of this rule. 


5.37. Example. Fix a positive integer $k$, and let $S$ be the set of all words using the alphabet
$\{1, 2, \ldots, k\}$, weighted by length. On one hand, the definition of EGFs and the Word Rule
show that

\[ \mathrm{EGF}(S; z) = \sum_{n=0}^{\infty} k^n \frac{z^n}{n!} = e^{kz}. \]

On the other hand, we can derive this result from the EGF Product Rule as follows. Take
all of the sets $S_1, S_2, \ldots, S_k$ to be the weighted set $\mathbb{Z}_{\geq 0}$ from Example 5.34. We can build
an arbitrary word $u$ in $S$ as follows. Choose $i_1 \in S_1$, which represents the number of times
1 will appear in $u$. Choose $i_2 \in S_2$, which is the number of 2’s in $u$. Choose $i_3, \ldots, i_k$ similarly.
Finally, choose an anagram $w \in R(1^{i_1} \cdots k^{i_k})$, and let $u = w$. The weight-additivity
condition holds, since the length of $u$ is $i_1 + i_2 + \cdots + i_k$. By the EGF Product Rule and
Example 5.34,
\[ \mathrm{EGF}(S; z) = [\mathrm{EGF}(\mathbb{Z}_{\geq 0}; z)]^k = (e^z)^k = e^{kz}. \]


We did not really need the EGF Product Rule in the last example, since we already 
knew how many words of length $n$ there are. But by modifying the sets $S_j$ in the previous
example, we can use EGFs to solve far more complicated counting problems. 


5.38. Example. How many 15-letter words using the alphabet $\{1,2,3,4\}$ have an odd
number of 1’s, an even number of 2’s, and at most three 4’s? To solve this, we find the EGF
for the set $S$ of all words (of any length) satisfying the given restrictions, and extract the
coefficient of $z^{15}$. For this problem, take $S_1$ to be the set of odd positive integers, $S_2$ to
be the set of even nonnegative integers, $S_3 = \mathbb{Z}_{\geq 0}$, and $S_4 = \{0,1,2,3\}$. In each case, let
$\mathrm{wt}_{S_j}(i) = i$ for $i \in S_j$. By definition, the EGF for $S_1$ is


\[ \mathrm{EGF}(S_1; z) = \sum_{n \text{ odd}} \frac{z^n}{n!} = z + z^3/3! + z^5/5! + z^7/7! + \cdots = \sinh z. \]


Similarly, the EGF for $S_2$ is $\cosh z$, the EGF for $S_3$ is $e^z$, and the EGF for $S_4$ is $1 + z +
z^2/2! + z^3/3!$. We build $u \in S$ by first choosing $i_1 \in S_1$, $i_2 \in S_2$, $i_3 \in S_3$, and $i_4 \in S_4$, where
$i_j$ is the number of $j$’s that will appear in $u$. Then we choose $w \in R(1^{i_1} 2^{i_2} 3^{i_3} 4^{i_4})$ and set
$u = w$. Since the length of $u$ is $i_1 + i_2 + i_3 + i_4$, the EGF Product Rule applies. We conclude
that

\[ \mathrm{EGF}(S; z) = (\sinh z)(\cosh z)\,e^z\,(1 + z + z^2/2 + z^3/6). \]


Using a computer algebra system, we find the coefficient of $z^{15}$ in this power series and
multiply by 15! to obtain the answer 123,825,662.
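
A computer algebra system is not strictly required for this extraction. The sketch below (helper names are illustrative) multiplies the four truncated EGFs using exact rational arithmetic, and cross-checks the method against brute-force enumeration for short words.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def egf(allowed, n_max):
    """Truncated EGF of the set {i : allowed(i)}, where i has weight i."""
    return [Fraction(1, factorial(n)) if allowed(n) else Fraction(0)
            for n in range(n_max + 1)]

def multiply(a, b):
    """Product of two power series truncated at the same degree."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]

def count_words(n):
    """n! times the coefficient of z^n in the product of the four EGFs."""
    rules = [lambda i: i % 2 == 1,   # odd number of 1's  (sinh z)
             lambda i: i % 2 == 0,   # even number of 2's (cosh z)
             lambda i: True,         # any number of 3's  (e^z)
             lambda i: i <= 3]       # at most three 4's
    series = [Fraction(1)] + [Fraction(0)] * n
    for rule in rules:
        series = multiply(series, egf(rule, n))
    return int(series[n] * factorial(n))

def brute_force(n):
    """Direct enumeration; only feasible for small n."""
    return sum(1 for w in product("1234", repeat=n)
               if w.count("1") % 2 == 1 and w.count("2") % 2 == 0
               and w.count("4") <= 3)

print(count_words(15))   # 123825662
```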


5.39. Example: Surjections. Fix a positive integer $k$. Let us find the EGF for the set
$S$ of all surjective functions mapping some set $\{1, 2, \ldots, n\}$ onto $\{1, 2, \ldots, k\}$, where $n \geq k$.
Take the weight of a function $f$ in $S$ to be the size $n$ of the domain of $f$. On one hand,
by the Surjection Rule, we know that the number of objects of weight $n$ in $S$ is $S(n,k)\,k!$,
where $S(n,k)$ is a Stirling number of the second kind. So by definition, the EGF of $S$ is
$\mathrm{EGF}(S; z) = \sum_{n=k}^{\infty} S(n,k)\,k!\,z^n/n!$. On the other hand, we can build objects in $S$ by the EGF
Product Rule, as follows. Take all of the sets $S_1, \ldots, S_k$ to be $\mathbb{Z}_{>0}$ weighted by $\mathrm{wt}(i) = i$.
Then $\mathrm{EGF}(S_j; z) = \sum_{n=1}^{\infty} z^n/n! = e^z - 1$ for $1 \leq j \leq k$. Build $f \in S$ as follows. Choose
positive integers $i_1 \in S_1, \ldots, i_k \in S_k$, and then choose an anagram $w \in R(1^{i_1} \cdots k^{i_k})$. Let
$n = i_1 + \cdots + i_k$, and define $f(r) = w_r$ for $1 \leq r \leq n$. This function $f$ is surjective since
every $i_j$ is strictly positive. The EGF Product Rule applies to show $\mathrm{EGF}(S; z) = (e^z - 1)^k$.
Comparing to the previous expression for this EGF, we deduce the following generating
function identity for Stirling numbers of the second kind:


\[ \sum_{n=k}^{\infty} S(n,k) \frac{z^n}{n!} = \frac{(e^z - 1)^k}{k!}. \]


We can also interpret this example as finding the EGF for words w in the alphabet 
{1,2,...,k} that use every letter at least once. 
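
As a sanity check on this example, the coefficients of $(e^z - 1)^k$ can be compared with a direct count of surjections. This is a minimal sketch with illustrative function names; the brute-force count is only practical for small $n$ and $k$.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def surjection_counts_from_egf(k, n_max):
    """List n! * [z^n] (e^z - 1)^k for n = 0, ..., n_max."""
    base = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, n_max + 1)]
    series = [Fraction(1)] + [Fraction(0)] * n_max
    for _ in range(k):
        series = [sum(series[i] * base[n - i] for i in range(n + 1))
                  for n in range(n_max + 1)]
    return [int(c * factorial(n)) for n, c in enumerate(series)]

def surjections_brute_force(n, k):
    """Count words of length n over a k-letter alphabet using every letter."""
    return sum(1 for w in product(range(k), repeat=n) if len(set(w)) == k)

print(surjection_counts_from_egf(3, 6))   # [0, 0, 0, 6, 36, 150, 540]
```

Dividing each entry by $k!$ recovers the Stirling numbers $S(n,k)$.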


In all the examples so far, every set $S_j$ had at most one element of any given weight
$n$, and the final object constructed by the EGF Product Rule was the anagram $w$. The
next example illustrates a more complex situation where objects chosen from the sets $S_j$
are combined with $w$ to build a more elaborate structure.


5.40. Example: Digraphs with k Cycles. Fix a positive integer $k$. Let $S$ be the set of
functional digraphs $G$ such that the vertex set of $G$ is $\{1, 2, \ldots, n\}$ for some $n \geq k$; $G$ consists
of $k$ disjoint directed cycles; and each cycle in $G$ has been given a distinct label chosen from
$\{1, 2, \ldots, k\}$. Let the weight of $G$ be $n$, the number of vertices in $G$. Recalling the definition
of $c(n,k)$ from §3.6, we see that the EGF for $S$ is $\mathrm{EGF}(S; z) = \sum_{n=k}^{\infty} c(n,k)\,k!\,z^n/n!$. We
obtain another formula for this EGF using the EGF Product Rule. For $1 \leq j \leq k$, let $S_j$ be
the set of permutations of $\{1, 2, \ldots, n\}$ (for some $n \geq 1$) whose functional digraph is
a single directed cycle. There are $(n-1)!$ such permutations of $\{1, 2, \ldots, n\}$, and therefore


\[ \mathrm{EGF}(S_j; z) = \sum_{n=1}^{\infty} (n-1)!\,\frac{z^n}{n!} = \sum_{n=1}^{\infty} \frac{z^n}{n} = -\mathrm{Log}(1 - z). \]


We build an object in $S$ as follows. Choose cycles $g_1 \in S_1, \ldots, g_k \in S_k$ of respective lengths
$i_1, \ldots, i_k$, choose $w \in R(1^{i_1} \cdots k^{i_k})$, and let $n = i_1 + \cdots + i_k$. Suppose the $i_1$ copies of
1 in $w$ occur in positions $p(1) < p(2) < \cdots < p(i_1)$. Replace the symbols $1, 2, \ldots, i_1$ in
the cycle $g_1$ by the symbols $p(1), p(2), \ldots, p(i_1)$ (respectively) to get a cycle of length $i_1$
involving the symbols $p(1), \ldots, p(i_1)$. Attach the label 1 to this cycle. Renumber the entries
in the other cycles $g_2, \ldots, g_k$ similarly. In general, if the $i_j$ copies of $j$ in $w$ occur in positions
$p(1) < p(2) < \cdots < p(i_j)$, then we replace the symbols $1, 2, \ldots, i_j$ in cycle $g_j$ by the symbols
$p(1), p(2), \ldots, p(i_j)$ in this order, and we attach the label $j$ to this cycle. The union of all
the renumbered cycles is a functional digraph on $\{1, 2, \ldots, n\}$ consisting of $k$ cycles labeled
$1, 2, \ldots, k$. Finally, by the EGF Product Rule, $\mathrm{EGF}(S; z) = \prod_{j=1}^{k} \mathrm{EGF}(S_j; z)$, so


\[ \sum_{n=k}^{\infty} c(n,k) \frac{z^n}{n!} = \frac{[-\mathrm{Log}(1 - z)]^k}{k!}. \tag{5.10} \]


5.13 Stirling Numbers of the First Kind 


So far, we have considered generating functions in a single variable z. It is sometimes 
necessary to use generating functions involving two or more variables. For example, suppose 
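$S$ is a set of objects and $\mathrm{wt}_1$ and $\mathrm{wt}_2$ are two weight functions for $S$. The ordinary generating
function for $S$ relative to these weights is


\[ \mathrm{GF}(S; t, z) = \sum_{u \in S} t^{\mathrm{wt}_1(u)} z^{\mathrm{wt}_2(u)} = \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} a_{k,n}\, t^k z^n, \]


where $a_{k,n} = |\{u \in S : \mathrm{wt}_1(u) = k \text{ and } \mathrm{wt}_2(u) = n\}|$. We can think of $\mathrm{GF}(S; t, z)$ as a
formal power series in $z$, where each coefficient is a formal power series in $t$. We could also
consider the exponential generating function


\[ \mathrm{EGF}(S; t, z) = \sum_{u \in S} \frac{t^{\mathrm{wt}_1(u)}}{\mathrm{wt}_1(u)!}\,\frac{z^{\mathrm{wt}_2(u)}}{\mathrm{wt}_2(u)!} = \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} a_{k,n}\, \frac{t^k}{k!}\, \frac{z^n}{n!}, \]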

or a mixed version where only one variable is divided by a factorial. 

In this section and the next, we illustrate manipulations of two-variable generating 
functions by developing generating functions for the Stirling numbers of the first and second 
kind. Recall from §3.6 that the signless Stirling number of the first kind, denoted c(n, k), 
counts the number of permutations of n objects whose functional digraphs consist of k 
disjoint cycles. These numbers are determined by the recursion 


\[ c(n,k) = c(n-1,k-1) + (n-1)\,c(n-1,k) \quad \text{for } 0 < k < n, \]


with initial conditions $c(n,0) = 0$ for $n > 0$, $c(n,n) = 1$ for $n \geq 0$, and $c(n,k) = 0$ for $k < 0$
or $k > n$.
We define the mixed generating function for the numbers $c(n,k)$ by


\[ F(t,z) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{c(n,k)}{n!}\, t^k z^n. \]

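Observe that the coefficient of $t^k z^n$ in $F$, namely $c(n,k)/n!$, is the probability that a randomly
chosen permutation of $n$ objects will have $k$ cycles. We will use the recursion for
$c(n,k)$ to show that $F$ satisfies the partial differential equation $(1-z)\,\partial F/\partial z = tF$, where
$\partial F/\partial z$ denotes the formal partial derivative of $F$ with respect to the variable $z$. Using the
recursion, compute


\begin{align*}
\frac{\partial F}{\partial z} &= \sum_{n=1}^{\infty} \sum_{k=0}^{n} \frac{n\,c(n,k)}{n!}\, t^k z^{n-1}
 = \sum_{n=1}^{\infty} \sum_{k=0}^{n} \frac{n\,[c(n-1,k-1) + (n-1)\,c(n-1,k)]}{n!}\, t^k z^{n-1} \\
&= \sum_{n=1}^{\infty} \sum_{k=0}^{n} \frac{c(n-1,k-1)}{(n-1)!}\, t^k z^{n-1}
 + \sum_{n=1}^{\infty} \sum_{k=0}^{n} \frac{(n-1)\,c(n-1,k)}{(n-1)!}\, t^k z^{n-1}.
\end{align*}

In the first summation, let $m = n-1$ and $j = k-1$. After discarding summands equal to
zero, we see that


\[ \sum_{n=1}^{\infty} \sum_{k=0}^{n} \frac{c(n-1,k-1)}{(n-1)!}\, t^k z^{n-1} = t \sum_{m=0}^{\infty} \sum_{j=0}^{m} \frac{c(m,j)}{m!}\, t^j z^m = tF. \]


On the other hand, letting $m = n-1$ in the second summation shows that


\[ \sum_{n=1}^{\infty} \sum_{k=0}^{n} \frac{(n-1)\,c(n-1,k)}{(n-1)!}\, t^k z^{n-1} = z \sum_{m=1}^{\infty} \sum_{k=0}^{m} c(m,k)\, t^k\, \frac{z^{m-1}}{(m-1)!} = z\,\frac{\partial F}{\partial z}, \]


since $\frac{\partial}{\partial z}(z^m/m!) = z^{m-1}/(m-1)!$. So we indeed have $\partial F/\partial z = tF + z\,\partial F/\partial z$, as claimed.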

We now know a formal differential equation satisfied by F(t, z), together with the initial 
condition $F(t,0) = 1$. To find an explicit formula for $F$, we solve for $F$ by techniques that
may be familiar from calculus. However, one must remember that all our computations need
to be justifiable at the level of formal power series. We defer these technical matters for the
time being (see Chapter 11). To begin solving, divide both sides of the differential equation
$(1-z)\,\partial F/\partial z = tF$ by $(1-z)F$, obtaining


\[ \frac{\partial F/\partial z}{F} = \frac{t}{1-z}. \]


We recognize the left side as the partial derivative of $\mathrm{Log}[F(t,z)]$ with respect to $z$. On
the other hand, the right side is the partial derivative of $\mathrm{Log}[(1-z)^{-t}]$ with respect to $z$.
Taking the antiderivative of both sides with respect to $z$, we therefore get


\[ \mathrm{Log}[F(t,z)] = \mathrm{Log}[(1-z)^{-t}] + C(t). \]


By letting $z = 0$ on both sides, we see that the integration constant $C(t)$ is zero. Finally,
we exponentiate both sides to arrive at the generating function


\[ F(t,z) = (1-z)^{-t}. \tag{5.11} \]


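Having discovered this formula for $F$, we can now give an independent verification of its
correctness by invoking our earlier results on Stirling numbers and generalized powers:


\begin{align*}
(1-z)^{-t} &= \sum_{n=0}^{\infty} \binom{-t}{n} (-z)^n && \text{by the Extended Binomial Theorem} \\
&= \sum_{n=0}^{\infty} \frac{t(t+1)\cdots(t+n-1)}{n!}\, z^n && \text{by Definition 2.63} \\
&= \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} c(n,k)\, t^k \right) \frac{z^n}{n!} && \text{by Theorem 2.60.}
\end{align*}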
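
The last step of this verification says that the coefficients of the polynomial $t(t+1)\cdots(t+n-1)$ are the numbers $c(n,k)$. A short sketch (with illustrative function names) confirms this for small $n$ by comparing the recursion for $c(n,k)$ with a direct polynomial expansion.

```python
def cycle_numbers(n_max):
    """Table c[n][k] built from c(n,k) = c(n-1,k-1) + (n-1)*c(n-1,k)."""
    c = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            c[n][k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    return c

def rising_factorial_coefficients(n):
    """Coefficients in t of t(t+1)...(t+n-1), lowest degree first."""
    poly = [1]
    for a in range(n):
        # multiply the current polynomial by (t + a)
        new = [0] * (len(poly) + 1)
        for d, coeff in enumerate(poly):
            new[d] += a * coeff
            new[d + 1] += coeff
        poly = new
    return poly

print(rising_factorial_coefficients(4))   # [0, 6, 11, 6, 1]
```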


5.14 Stirling Numbers of the Second Kind 


In this section, we derive a generating function for Stirling numbers of the second kind. 
Recall from §2.12 that S(n,k) is the number of set partitions of an n-element set into k 
nonempty blocks. Define the two-variable generating function


\[ G(t,z) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{S(n,k)}{n!}\, t^k z^n. \]


The following recursion will help us find a differential equation satisfied by $G$.


5.41. Theorem: Recursion for Stirling Numbers. For all $n \geq 0$ and $0 \leq k \leq n+1$,


\[ S(n+1, k) = \sum_{i=0}^{n} \binom{n}{i} S(n-i, k-1). \]


The initial conditions are $S(0,0) = 1$ and $S(n,k) = 0$ whenever $k < 0$ or $k > n$.


Proof. Consider set partitions of $\{1, 2, \ldots, n+1\}$ into $k$ blocks such that the block containing
$n+1$ has $i$ other elements in it (where $0 \leq i \leq n$). To build such a set partition, choose
the $i$ elements that go in the block with $n+1$ in $\binom{n}{i}$ ways, and then choose a set partition
of the remaining $n-i$ elements into $k-1$ blocks. The recursion now follows from the Sum
Rule and Product Rule. (Compare to the proof of Theorem 2.46.) □
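
The recursion can be spot-checked against a table of Stirling numbers. The sketch below builds the table from the familiar recursion $S(n,k) = S(n-1,k-1) + k\,S(n-1,k)$, which is assumed here as background and is not restated in this section; the function names are illustrative.

```python
from math import comb

def stirling2_table(n_max):
    """Table S[n][k] from S(n,k) = S(n-1,k-1) + k*S(n-1,k) (assumed recursion)."""
    S = [[0] * (n_max + 2) for _ in range(n_max + 1)]
    S[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return S

def theorem_5_41_rhs(S, n, k):
    """Right side of Theorem 5.41: sum over i of binom(n,i) * S(n-i, k-1)."""
    if k == 0:
        return 0      # S(m, -1) = 0 for every m
    return sum(comb(n, i) * S[n - i][k - 1] for i in range(n + 1))

S = stirling2_table(9)
print(S[5][3], theorem_5_41_rhs(S, 4, 3))   # both equal S(5,3) = 25
```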


5.42. Theorem: Differential Equation for G. The generating function $G(t,z) =
\sum_{n=0}^{\infty} \sum_{k=0}^{n} S(n,k)\, t^k z^n/n!$ satisfies $G(t,0) = 1$ and $\partial G/\partial z = t e^z G$.


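Proof. The partial derivative of $G$ with respect to $z$ is


\[ \frac{\partial G}{\partial z} = \sum_{m=0}^{\infty} \sum_{k=0}^{m} \frac{S(m,k)\, t^k\, m z^{m-1}}{m!} = \sum_{n=0}^{\infty} \sum_{k=0}^{n+1} \frac{S(n+1,k)\, t^k z^n}{n!}, \]


where we have set $n = m-1$. Using Theorem 5.41 transforms this expression into


\[ \sum_{n=0}^{\infty} \sum_{k=0}^{n+1} \sum_{i=0}^{n} \frac{1}{n!} \binom{n}{i} S(n-i, k-1)\, t^k z^n = \sum_{n=0}^{\infty} \sum_{k=0}^{n+1} \sum_{i=0}^{n} \frac{S(n-i,k-1)\, t^k z^{n-i}}{(n-i)!} \cdot \frac{z^i}{i!}. \]


Setting $j = k-1$, the formula becomes

\[ t \sum_{n=0}^{\infty} \sum_{j=0}^{n} \sum_{i=0}^{n} \frac{S(n-i,j)\, t^j z^{n-i}}{(n-i)!} \cdot \frac{z^i}{i!} = t \sum_{n=0}^{\infty} \sum_{i=0}^{n} \left[ \sum_{j=0}^{n-i} \frac{S(n-i,j)\, t^j z^{n-i}}{(n-i)!} \right] \cdot \frac{z^i}{i!}. \]


Finally, by the definition of the product of formal power series in $z$, the last expression
equals


\[ t \left( \sum_{i=0}^{\infty} \frac{z^i}{i!} \right) \cdot \left( \sum_{m=0}^{\infty} \left[ \sum_{j=0}^{m} \frac{S(m,j)\, t^j}{m!} \right] z^m \right) = t e^z G. \qquad \Box \]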

5.43. Theorem: Generating Function for Stirling Numbers of the Second Kind.


\[ \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{S(n,k)}{n!}\, t^k z^n = e^{t(e^z - 1)}. \]


Proof. We solve the differential equation $\partial G/\partial z = t e^z G$ with initial condition $G(t,0) =
1$. Dividing both sides by $G$ and taking the antiderivative with respect to $z$, we get
$\mathrm{Log}(G(t,z)) = t e^z + C(t)$. Setting $z = 0$ gives $0 = t + C(t)$, so the constant of integration
is $C(t) = -t$. Exponentiating both sides, we find that


\[ G(t,z) = e^{t e^z - t} = e^{t(e^z - 1)}. \qquad \Box \]


5.44. Theorem: Generating Function for S(n,k) for fixed k. For all $k \geq 0$,

\[ \sum_{n=k}^{\infty} S(n,k) \frac{z^n}{n!} = \frac{1}{k!} (e^z - 1)^k. \]

Proof. Using $e^u = \sum_{k=0}^{\infty} u^k/k!$ with $u = t(e^z - 1)$, we have


\[ e^{t(e^z - 1)} = \sum_{k=0}^{\infty} t^k \frac{(e^z - 1)^k}{k!}. \]


Extracting the coefficient of $t^k$ and using Theorem 5.43, we obtain the required formula.
(Compare to Example 5.39, which derives the same formula using the EGF Product Rule.)
□


5.15 Generating Functions for Integer Partitions 


This section uses generating functions to prove some fundamental results involving integer 
partitions. Recall from §2.11 that an integer partition is a weakly decreasing sequence of 
positive integers. The area of a partition $\mu = (\mu_1, \ldots, \mu_k)$ is $|\mu| = \mu_1 + \cdots + \mu_k$, the sum of
the parts of $\mu$. Let Par denote the set of all integer partitions, weighted by area.

To get a generating function for Par, we need an extension of the Product Rule for 
Weighted Sets involving infinite products. We introduce this extension informally here, 
delaying a rigorous discussion until §11.2. Intuitively, in this version of the Product Rule, 
we build objects in a given set S by making a potentially infinite sequence of choices. 
However, we require that any particular object in the set can be completely built after 
making only finitely many choices, where the number of choices needed depends on the 
object. If the weight-additivity condition holds, then GF(S; z) is the infinite product of the 
generating functions for each separate choice in the construction sequence. The following 
theorem gives a concrete illustration of this setup. 


5.45. Theorem: Partition Generating Function.


\[ \mathrm{GF}(\mathrm{Par}; z) = \sum_{\mu \in \mathrm{Par}} z^{|\mu|} = \prod_{i=1}^{\infty} \frac{1}{1 - z^i}. \]


Proof. We use the infinite version of the Product Rule for Weighted Sets. We build a typical
partition $\mu \in \mathrm{Par}$ by making an infinite sequence of choices, as follows. First, choose how
many parts of size 1 will occur in $\mu$. The set of possible choices here is $\mathbb{Z}_{\geq 0} = \{0, 1, 2, 3, \ldots\}$.
For this first choice, we let $\mathrm{wt}(k) = k$ for $k \in \mathbb{Z}_{\geq 0}$. The generating function for this choice is
$1 + z + z^2 + z^3 + \cdots = (1-z)^{-1}$. Second, choose how many parts of size 2 will occur in $\mu$. Again
the set of possibilities is $\{0, 1, 2, 3, \ldots\}$, but now we use the weight $\mathrm{wt}(k) = 2k$. The reason is
that including $k$ parts of size 2 in the partition $\mu$ will contribute $2k$ to $|\mu|$. So the generating
function for the second choice is $1 + z^2 + z^4 + z^6 + \cdots = (1 - z^2)^{-1}$. Proceed similarly,
choosing for every $i \geq 1$ how many parts of size $i$ will occur in $\mu$. Since choosing $k$ parts
of size $i$ increases $|\mu|$ by $ki$, the generating function for choice $i$ is $\sum_{k=0}^{\infty} (z^i)^k = (1 - z^i)^{-1}$.
Multiplying the generating functions for all the choices gives the infinite product in the
theorem. □
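
Because the factor $(1 - z^i)^{-1}$ affects only coefficients of $z^n$ with $n \geq i$, the first few coefficients of the infinite product can be computed from finitely many factors. The following sketch (the function name is illustrative) multiplies in one factor at a time.

```python
def partition_counts(n_max):
    """Coefficients of prod_{i>=1} 1/(1 - z^i), for z^0 .. z^n_max."""
    coeffs = [1] + [0] * n_max
    for i in range(1, n_max + 1):
        # multiply by 1/(1 - z^i) = 1 + z^i + z^(2i) + ...;
        # the ascending loop lets part i be used any number of times
        for n in range(i, n_max + 1):
            coeffs[n] += coeffs[n - i]
    return coeffs

print(partition_counts(10))   # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42]
```

The entry in position $n$ is the number of integer partitions of area $n$.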


It may be helpful to consider a finite version of the previous proof. For fixed $N \in \mathbb{Z}_{>0}$,
let $\mathrm{Par}^{\leq N}$ be the set of all integer partitions $\mu$ such that $\mu_i \leq N$ for all $i$, or equivalently
$\mu_1 \leq N$. Using the finite version of the Product Rule for Weighted Sets, the construction
in the preceding proof shows that


\[ \mathrm{GF}(\mathrm{Par}^{\leq N}; z) = \sum_{\mu \in \mathrm{Par}^{\leq N}} z^{|\mu|} = \prod_{i=1}^{N} \frac{1}{1 - z^i}. \]


Informally, when we take the limit as $N$ goes to infinity, the sets $\mathrm{Par}^{\leq N}$ get larger and
larger and approach the full set Par consisting of all integer partitions. More precisely,
$\bigcup_{N=1}^{\infty} \mathrm{Par}^{\leq N} = \mathrm{Par}$. So it is plausible that taking the limit of the generating functions of
the sets $\mathrm{Par}^{\leq N}$ should give the generating function for the full set Par:


\[ \lim_{N \to \infty} \mathrm{GF}(\mathrm{Par}^{\leq N}; z) = \mathrm{GF}(\mathrm{Par}; z), \quad \text{or} \quad \lim_{N \to \infty} \prod_{i=1}^{N} \frac{1}{1 - z^i} = \prod_{i=1}^{\infty} \frac{1}{1 - z^i}. \]


The only difficulty is that we have not yet precisely defined the limit of a sequence of 
(formal) power series. We return to this technical point in §11.1. 

We can add another variable to the partition generating function to keep track of additional
information. Recall that, for $\mu \in \mathrm{Par}$, $\mu_1$ is the length of the first (longest) part of $\mu$,
and $\ell(\mu)$ is the number of nonzero parts of $\mu$.


5.46. Theorem: Enumerating Integer Partitions by Area and Length.


\[ \sum_{\mu \in \mathrm{Par}} t^{\ell(\mu)} z^{|\mu|} = \prod_{i=1}^{\infty} \frac{1}{1 - t z^i} = \sum_{\mu \in \mathrm{Par}} t^{\mu_1} z^{|\mu|}. \]


Proof. To prove the first equality, we modify the proof of Theorem 5.45 to keep track of the
$t$-weight. At stage $i$, suppose we choose $k$ copies of the part $i$ for inclusion in $\mu$. This will
increase $\ell(\mu)$ by $k$ and increase $|\mu|$ by $ki$. So the generating function for the choice made at
stage $i$ is


\[ \sum_{k=0}^{\infty} t^k z^{ki} = \sum_{k=0}^{\infty} (t z^i)^k = \frac{1}{1 - t z^i}. \]


The result now follows from the infinite version of the Product Rule for Weighted Sets.
To prove the second equality, observe that conjugation is a bijection on Par that preserves
area and satisfies $(\mu')_1 = \ell(\mu)$. So the result follows from the Bijection Rule for Weighted
Sets. □


We can use variations of the preceding proofs to derive generating functions for various 
subcollections of integer partitions. 


5.47. Theorem: Partitions with Odd Parts. Let OddPar be the set of integer partitions
all of whose parts are odd. Then


\[ \sum_{\mu \in \mathrm{OddPar}} z^{|\mu|} = \prod_{k=1}^{\infty} \frac{1}{1 - z^{2k-1}}. \]


Proof. Repeat the proof of Theorem 5.45, but now make choices only for the odd part
lengths 1, 3, 5, 7, etc. □


5.48. Theorem: Partitions with Distinct Parts. Let DisPar be the set of integer
partitions all of whose parts are distinct. Then


\[ \sum_{\mu \in \mathrm{DisPar}} z^{|\mu|} = \prod_{i=1}^{\infty} (1 + z^i). \]


Proof. We build a partition $\mu \in \mathrm{DisPar}$ via the following choice sequence. For each part
length $i \geq 1$, either choose to not use that part in $\mu$ or to include that part in $\mu$ (note that
the part is only allowed to occur once). The generating function for this choice is $1 + z^i$.
The result now follows from the infinite version of the Product Rule for Weighted Sets. □


By comparing the generating functions in the last two theorems, we are led to the 
following unexpected result. 


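5.49. Theorem: OddPar vs. DisPar.


\[ \sum_{\mu \in \mathrm{OddPar}} z^{|\mu|} = \sum_{\nu \in \mathrm{DisPar}} z^{|\nu|}. \]


Proof. We formally manipulate the generating functions as follows:


\begin{align*}
\sum_{\mu \in \mathrm{OddPar}} z^{|\mu|} &= \prod_{k=1}^{\infty} \frac{1}{1 - z^{2k-1}} \\
&= \prod_{k=1}^{\infty} \frac{1}{1 - z^{2k-1}} \prod_{j=1}^{\infty} \frac{1}{1 - z^{2j}} \prod_{j=1}^{\infty} (1 - z^{2j}) \\
&= \prod_{i=1}^{\infty} \frac{1}{1 - z^i} \prod_{j=1}^{\infty} [(1 - z^j)(1 + z^j)] \\
&= \prod_{i=1}^{\infty} \frac{1}{1 - z^i} \prod_{j=1}^{\infty} (1 - z^j) \prod_{j=1}^{\infty} (1 + z^j) \\
&= \prod_{j=1}^{\infty} (1 + z^j) = \sum_{\nu \in \mathrm{DisPar}} z^{|\nu|}. \qquad \Box
\end{align*}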
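
The identity is easy to test numerically, since only finitely many factors of each product affect the coefficients up to a given degree. The sketch below (function names are illustrative) expands both products up to $z^{n_{\max}}$.

```python
def odd_part_counts(n_max):
    """Coefficients of prod_{k>=1} 1/(1 - z^(2k-1)) up to z^n_max."""
    coeffs = [1] + [0] * n_max
    for i in range(1, n_max + 1, 2):
        for n in range(i, n_max + 1):          # ascending: part i may repeat
            coeffs[n] += coeffs[n - i]
    return coeffs

def distinct_part_counts(n_max):
    """Coefficients of prod_{i>=1} (1 + z^i) up to z^n_max."""
    coeffs = [1] + [0] * n_max
    for i in range(1, n_max + 1):
        for n in range(n_max, i - 1, -1):      # descending: part i used at most once
            coeffs[n] += coeffs[n - i]
    return coeffs

print(odd_part_counts(8))        # [1, 1, 1, 2, 2, 3, 4, 5, 6]
print(distinct_part_counts(8))   # [1, 1, 1, 2, 2, 3, 4, 5, 6]
```

Agreement of the two lists for small degrees is evidence for the theorem, not a proof of it.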


This “proof” glosses over many technical details about infinite products of formal power
series. For example, when going from the first line to the second line, it is permissible to
multiply by 1 in the form $\prod_{j=1}^{N} (1 - z^{2j})^{-1} \prod_{j=1}^{N} (1 - z^{2j})$. (This can be justified rigorously
by induction on $N$.) But how do we know that this manipulation is allowed for $N = \infty$?

Similarly, to reach the third line, we need to know that


\[ \prod_{i \geq 1,\ i \text{ odd}} \frac{1}{1 - z^i} \prod_{i \geq 1,\ i \text{ even}} \frac{1}{1 - z^i} = \prod_{i \geq 1} \frac{1}{1 - z^i}. \]


Regarding the fourth line, how do we know that $\prod_{j=1}^{\infty} [(1 - z^j)(1 + z^j)]$ can be split into
$\prod_{j=1}^{\infty} (1 - z^j) \prod_{j=1}^{\infty} (1 + z^j)$? And on the final line, we again need to know an infinite version
of the cancellation property $(1 - z^i)^{-1}(1 - z^i) = 1$.

Each of these steps can indeed be justified rigorously, but we defer a careful discussion
until Chapter 11. The main point now is that we must not blindly assume that algebraic
manipulations valid for finite sums and products will automatically carry over to infinite
sums and products (although they often do).


5.16 Partition Bijections


We saw above that the generating function for partitions with odd parts (weighted by 
area) coincides with the generating function for partitions with distinct parts. We gave an 
algebraic proof of this result based on manipulation of infinite products of formal power 
series. However, from a combinatorial standpoint, we would like to have a bijective proof 
that GF(OddPar; z) = GF(DisPar; z). By the Bijection Rule for Weighted Sets, it is enough 
to construct an area-preserving bijection $F : \mathrm{OddPar} \to \mathrm{DisPar}$. Two such bijections are
presented in this section. 


5.50. Sylvester’s Bijection. Define $F : \mathrm{OddPar} \to \mathrm{DisPar}$ as follows. Given $\mu \in \mathrm{OddPar}$,
draw a centered version of the Ferrers diagram of $\mu$ in which the middle boxes of the parts of
$\mu$ are all drawn in the same column; an example is shown in Figure 5.3. Note that each part
of $\mu$ does have a middle box, because the part is odd. Label the columns in the centered
diagram of $\mu$ as $-k, \ldots, -2, -1, 0, 1, 2, \ldots, k$ from left to right, so the center column is
column 0. Label the rows $1, 2, 3, \ldots$ from top to bottom. We define $\nu = F(\mu)$ by dissecting
the centered diagram of $\mu$ into a sequence of disjoint L-shaped pieces (described below),
and letting the parts of $\nu$ be the number of cells in each piece. The first L-shaped piece
consists of all cells in column 0 together with all cells to the right of column 0 in row 1.
The second L-shaped piece consists of all cells in column $-1$ together with all cells left of
column $-1$ in row 1. The third piece consists of the unused cells in column 1 (so row 1 is
excluded) together with all cells right of column 1 in row 2. The fourth piece consists of the
unused cells in column $-2$ together with all cells left of column $-2$ in row 2. We proceed
similarly, working outward in both directions from the center column, cutting off L-shaped
pieces that alternately move up and right, then up and left (see Figure 5.3).


$F((13,13,11,11,11,7,7,7,7,5,3,3,1,1,1)) = (21,17,16,13,11,9,8,3,2,1)$


FIGURE 5.3
Sylvester’s partition bijection. [Centered Ferrers diagram cut into L-shaped pieces 1 through 10; odd-numbered pieces extend to the right, even-numbered pieces to the left.]


We see from the geometric construction that the size of each L-shaped piece is strictly
less than the size of the preceding piece. It follows that $F(\mu) = \nu = (\nu_1 > \nu_2 > \cdots)$ is
indeed an element of DisPar. Furthermore, since $|\mu|$ is the sum of the sizes of all the L-shaped
pieces, the map $F : \mathrm{OddPar} \to \mathrm{DisPar}$ is area-preserving. We must also check that
$F$ is a bijection by constructing a map $G : \mathrm{DisPar} \to \mathrm{OddPar}$ that is the two-sided inverse
of $F$.

To see how to define $G$, let us examine more closely the dimensions of the L-shaped
pieces that appear in the definition of $F(\mu)$. Note that each L-shaped piece consists of a
corner square, a vertical portion of zero or more squares below the corner, and a horizontal
portion of zero or more squares to the left or right of the corner. Let $y_0$ be the number
of cells in column 0 of the centered diagram of $\mu$ (so $y_0 = \ell(\mu)$). For all $i \ge 1$, let $x_i$
be the number of cells in the horizontal portion of the $(2i-1)$th L-shaped piece for $\mu$.
For all $i > 0$, let $y_i$ be the number of cells in the vertical portion of the $2i$th L-shaped
piece for $\mu$. For example, in Figure 5.3 we have $(y_0, y_1, y_2, \ldots) = (15, 11, 8, 6, 1, 0, 0, \ldots)$ and
$(x_1, x_2, \ldots) = (6, 5, 3, 2, 1, 0, 0, \ldots)$. Note that for all $i \ge 1$, $y_{i-1} > y_i$ whenever $y_{i-1} > 0$,
and $x_i > x_{i+1}$ whenever $x_i > 0$. Moreover, by the symmetry of the centered diagram of $\mu$
and the definition of $F$, we see that


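$$\begin{aligned}
\nu_1 &= y_0 + x_1, & \nu_2 &= x_1 + y_1, \\
\nu_3 &= y_1 + x_2, & \nu_4 &= x_2 + y_2, \\
\nu_5 &= y_2 + x_3, & \nu_6 &= x_3 + y_3,
\end{aligned}$$
and, in general,
$$\nu_{2i-1} = y_{i-1} + x_i \ \text{ for all } i \ge 1; \qquad \nu_{2i} = x_i + y_i \ \text{ for all } i \ge 1. \tag{5.12}$$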


To compute $G(\nu)$ for $\nu \in \mathrm{DisPar}$, we need to solve the preceding system of equations for
$x_i$ and $y_i$, given the part lengths $\nu_i$. Noting that $\nu_k$, $x_k$, and $y_k$ must all be zero for large
enough indices $k$, we can solve for each variable by taking the alternating sum of all the
given equations from some point forward. This forces us to define

$$y_i = \nu_{2i+1} - \nu_{2i+2} + \nu_{2i+3} - \nu_{2i+4} + \cdots \quad \text{for all } i \ge 0;$$

$$x_i = \nu_{2i} - \nu_{2i+1} + \nu_{2i+2} - \nu_{2i+3} + \cdots \quad \text{for all } i \ge 1.$$


It is readily verified that these choices of $x_i$ and $y_i$ do indeed satisfy the equations
$\nu_{2i-1} = y_{i-1} + x_i$ and $\nu_{2i} = x_i + y_i$. Furthermore, because the nonzero parts of $\nu$ are
distinct, the required inequalities ($y_{i-1} > y_i$ whenever $y_{i-1} > 0$, and $x_i > x_{i+1}$ whenever
$x_i > 0$) also hold. Now that we know the exact shape of each L-shaped piece, we can fit the
pieces together to recover the centered diagram of $\mu = G(\nu) \in \mathrm{OddPar}$. For example, given
$\nu = (9, 8, 5, 3, 1, 0, 0, \ldots)$, we compute

$$\begin{aligned}
y_0 &= 9 - 8 + 5 - 3 + 1 = 4, \\
x_1 &= 8 - 5 + 3 - 1 = 5, \\
y_1 &= 5 - 3 + 1 = 3, \\
x_2 &= 3 - 1 = 2, \\
y_2 &= 1.
\end{aligned}$$

Using this data to reconstitute the centered diagram, we find that $G(\nu) = (11, 7, 5, 3)$. To
finish, note that bijectivity of $F$ follows from the fact that, for each $\nu \in \mathrm{DisPar}$, the system
of equations in (5.12) has exactly one solution for the unknowns $x_i$ and $y_i$.
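
The dissection defining $F$ and the alternating sums defining $G$ can be checked by machine. The
following Python sketch is an illustration only, not part of the original development: the
names `sylvester_F`, `sylvester_G`, `col`, and `alt` are hypothetical, and partitions are assumed
to be given as decreasing lists of positive integers. The sketch reads $y_i$ off as the height of
column $i$ minus $i$, and $x_i$ as the half-width of row $i$ minus $(i-1)$.

```python
def sylvester_F(mu):
    """Sketch of F: OddPar -> DisPar. `mu` is a weakly decreasing list of odd parts."""
    # h[j] = number of cells to the right of column 0 in row j+1 of the centered diagram.
    h = [(part - 1) // 2 for part in mu]

    def col(c):
        # Number of cells in column c (equivalently, column -c) for c >= 0.
        return sum(1 for half in h if half >= c)

    nu = []
    i = 1
    while True:
        y_prev = max(0, col(i - 1) - (i - 1))                    # y_{i-1}
        x_i = max(0, h[i - 1] - (i - 1)) if i <= len(h) else 0   # x_i
        y_i = max(0, col(i) - i)                                 # y_i
        if y_prev + x_i == 0:      # piece 2i-1 is empty, so we are done
            break
        nu.append(y_prev + x_i)    # nu_{2i-1} = y_{i-1} + x_i
        if x_i + y_i == 0:         # piece 2i is empty, so we are done
            break
        nu.append(x_i + y_i)       # nu_{2i} = x_i + y_i
        i += 1
    return nu


def sylvester_G(nu):
    """Sketch of G: DisPar -> OddPar. `nu` is a strictly decreasing list of parts."""
    def alt(start):
        # nu_start - nu_{start+1} + nu_{start+2} - ...  (1-based indices, zero past the end)
        return sum((-1) ** (k - start) * nu[k - 1] for k in range(start, len(nu) + 1))

    def y(i):   # y_i for i >= 0
        return alt(2 * i + 1)

    def x(i):   # x_i for i >= 1
        return alt(2 * i)

    mu = []
    for row in range(1, y(0) + 1):   # y_0 is the number of rows
        # Rightmost column c < row whose vertical strip (rows c+1 .. c+y_c) reaches this row.
        c = max(c for c in range(row) if c + y(c) >= row)
        # If that column is row-1, this row holds the corner of piece 2*row-1,
        # so the horizontal arm of length x_row extends the row further right.
        half = c + x(row) if c == row - 1 else c
        mu.append(2 * half + 1)
    return mu
```

For instance, `sylvester_G([9, 8, 5, 3, 1])` reproduces the partition $(11, 7, 5, 3)$ computed above, and
applying `sylvester_F` to the partition in Figure 5.3 gives back the distinct parts listed there.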


5.51. Glaisher’s Bijection. We define a map $H : \mathrm{DisPar} \to \mathrm{OddPar}$ as follows. Each
integer $k \ge 1$ can be written uniquely in the form $k = 2^e c$, where $e \ge 0$ and $c$ is odd.
Given $\nu \in \mathrm{DisPar}$, we replace each part $k$ in $\nu$ by $2^e$ copies of the part $c$ (where $k = 2^e c$, as
above). Sorting the resulting odd numbers into decreasing order gives us an element $H(\nu)$
in OddPar such that $|H(\nu)| = |\nu|$. For example,


$$\begin{aligned}
H(15, 12, 10, 8, 6, 3, 1) &= \mathrm{sort}(15,\ 3, 3, 3, 3,\ 5, 5,\ 1, 1, 1, 1, 1, 1, 1, 1,\ 3, 3,\ 3,\ 1) \\
&= (15, 5, 5, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1).
\end{aligned}$$


The inverse map $K : \mathrm{OddPar} \to \mathrm{DisPar}$ is defined as follows. Consider a partition
$\mu \in \mathrm{OddPar}$. For each odd number $c$ that appears as a part of $\mu$, let $n = n(c) \ge 1$ be the
number of times $c$ occurs in $\mu$. We can write $n$ uniquely as a sum of distinct powers of 2
(this is the base-2 expansion of the integer $n$). Say $n = 2^{d_1} + 2^{d_2} + \cdots + 2^{d_t}$. We replace
the $n$ copies of $c$ in $\mu$ by parts of size $2^{d_1}c, 2^{d_2}c, \ldots, 2^{d_t}c$. These parts are distinct from one
another (since $d_1, \ldots, d_t$ are distinct), and they are also distinct from the parts obtained
in the same way from other odd values of $c$ appearing as parts of $\mu$. Sorting the parts thus
gives a partition $K(\mu) \in \mathrm{DisPar}$. For example,


$$K(7, 7, 7, 7, 7, 3, 3, 3, 3, 3, 3, 1, 1, 1) = \mathrm{sort}(28, 7, 12, 6, 2, 1) = (28, 12, 7, 6, 2, 1).$$
It is readily verified that $H \circ K$ and $K \circ H$ are identity maps.
Glaisher’s bijection generalizes to prove the following theorem. 


5.52. Theorem: Glaisher’s Partition Identity. For all $d \ge 2$ and $N \ge 0$, the number
of partitions of $N$ where no part repeats $d$ or more times equals the number of partitions
of $N$ with no part divisible by $d$.


Proof. For fixed $d$, let $A$ be the set of partitions where no part repeats $d$ or more times,
and let $B$ be the set of partitions with no part divisible by $d$. It suffices to describe
weight-preserving maps $H : A \to B$ and $K : B \to A$ such that $H \circ K$ and $K \circ H$ are identity maps.
We define $K$ by analogy with what we did above for the case $d = 2$. Fix $\mu \in B$. For each
$c$ that appears as a part of $\mu$, let $n = n(c)$ be the number of times the part $c$ occurs in $\mu$.
Write $n$ in base $d$ as

$$n = \sum_{k=0}^{s} a_k d^k \quad \text{where } 0 \le a_k < d \text{ for all } k,$$

and $n, a_0, \ldots, a_s$ all depend on $c$. To construct $K(\mu)$, we replace the $n$ copies of $c$ in $\mu$ by
$a_0$ copies of $d^0 c$, $a_1$ copies of $d^1 c$, ..., $a_k$ copies of $d^k c$, ..., and $a_s$ copies of $d^s c$. One checks
that the resulting partition does lie in the codomain $A$, using the fact that no part $c$ of $\mu$
is divisible by $d$.

To compute $H(\nu)$ for $\nu \in A$, note that each part $m$ in $\nu$ can be written uniquely in the
form $m = d^k c$ for some $k \ge 0$ and some $c = c(m)$ not divisible by $d$. Adding up all such
parts of $\nu$ that have the same value of $c$ produces an expression of the form $\sum_{k \ge 0} a_k d^k c$,
where $0 \le a_k < d$ by definition of $A$. To get $H(\nu)$, we replace all these parts by $\sum_{k \ge 0} a_k d^k$
copies of the part $c$, for every possible $c$ not divisible by $d$. Comparing the descriptions of
$H$ and $K$, it follows that these two maps are inverses. $\square$
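
The two maps in this proof are short enough to state as code. The following Python sketch is
one possible rendering, under the assumption that partitions are lists of positive integers;
the function names `glaisher_H` and `glaisher_K` are illustrative and do not come from the
text. Taking `d=2` recovers the maps $H$ and $K$ of 5.51.

```python
from collections import Counter


def glaisher_H(nu, d=2):
    """Map a partition in which no part repeats d or more times
    to a partition with no part divisible by d."""
    out = []
    for m in nu:
        copies = 1
        # Write m = d^k * c with c not divisible by d; then copies = d^k.
        while m % d == 0:
            m //= d
            copies *= d
        out.extend([m] * copies)
    return sorted(out, reverse=True)


def glaisher_K(mu, d=2):
    """Map a partition with no part divisible by d
    to a partition in which no part repeats d or more times."""
    out = []
    for c, n in Counter(mu).items():
        power = 1                       # current power d^k
        while n > 0:
            n, a_k = divmod(n, d)       # a_k is the next base-d digit of n(c)
            out.extend([power * c] * a_k)
            power *= d
    return sorted(out, reverse=True)
```

With these definitions, `glaisher_H([15, 12, 10, 8, 6, 3, 1])` and
`glaisher_K([7] * 5 + [3] * 6 + [1] * 3)` reproduce the two worked examples in 5.51.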


5.53. Remark: Rogers-Ramanujan Identities. A vast multitude of partition identities
have been discovered, which are similar in character to the one we just proved. Two
especially famous examples are the Rogers-Ramanujan Identities. The first such identity
says that, for all $N$, the number of partitions of $N$ into parts congruent to 1 or 4 modulo 5
equals the number of partitions of $N$ into distinct parts $\nu_1 > \nu_2 > \cdots > \nu_k > 0$ such that
$\nu_i - \nu_{i+1} \ge 2$ for $i$ in the range $1 \le i < k$. The second identity says that, for all $N$, the
number of partitions of $N$ into parts congruent to 2 or 3 modulo 5 equals the number of
partitions of $N$ into distinct parts $\nu_1 > \nu_2 > \cdots > \nu_k > 0 = \nu_{k+1}$ such that $\nu_i - \nu_{i+1} \ge 2$
for $i$ in the range $1 \le i \le k$. One can seek algebraic and bijective proofs for these and other
identities. Proofs of both types are known for the Rogers-Ramanujan Identities, but the
bijective proofs are all quite complicated.
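
Although the proofs are hard, the two identities are easy to test numerically for small $N$. The
Python sketch below is a brute-force check, not a proof; the helper names are hypothetical.
Note that the condition $\nu_k - \nu_{k+1} \ge 2$ with $\nu_{k+1} = 0$ in the second identity simply
says that the smallest part is at least 2, which is how the code encodes it.

```python
def count_with_parts(n, allowed):
    """Number of partitions of n whose parts all lie in the list `allowed`."""
    ways = [1] + [0] * n
    for part in allowed:
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


def count_gap_two(n, smallest):
    """Number of partitions of n with all parts >= smallest
    and any two parts differing by at least 2."""
    def build(remaining, min_part):
        # Choose parts in increasing order; the next part must exceed the last by >= 2.
        if remaining == 0:
            return 1
        return sum(build(remaining - part, part + 2)
                   for part in range(min_part, remaining + 1))
    return build(n, smallest)


def rogers_ramanujan_counts(n):
    """Return ((lhs1, rhs1), (lhs2, rhs2)) for the two identities at N = n."""
    first = (count_with_parts(n, [m for m in range(1, n + 1) if m % 5 in (1, 4)]),
             count_gap_two(n, 1))
    second = (count_with_parts(n, [m for m in range(1, n + 1) if m % 5 in (2, 3)]),
              count_gap_two(n, 2))
    return first, second
```

Each pair returned by `rogers_ramanujan_counts(n)` should consist of two equal numbers.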




5.17 Euler’s Pentagonal Number Theorem 


We have seen that $\prod_{i=1}^{\infty}(1 + z^i)$ is the generating function for partitions with distinct parts,
whereas $\prod_{i=1}^{\infty}(1 - z^i)^{-1}$ is the generating function for all integer partitions. This section
investigates the infinite product $\prod_{i=1}^{\infty}(1 - z^i)$, which is the multiplicative inverse for the
partition generating function. The next theorem gives the power series expansion for this
infinite product, which has a surprisingly high fraction of coefficients equal to zero. This
expansion leads to a remarkable recursion for $p(n)$, the number of partitions of $n$.


5.54. Euler’s Pentagonal Number Theorem.

$$\prod_{i=1}^{\infty}(1 - z^i) = 1 + \sum_{n=1}^{\infty}(-1)^n\left[z^{n(3n-1)/2} + z^{n(3n+1)/2}\right]
= 1 - z - z^2 + z^5 + z^7 - z^{12} - z^{15} + \cdots.$$


Proof. Consider the set DisPar of integer partitions with distinct parts, weighted by area.
For $\mu \in \mathrm{DisPar}$, define the sign of $\mu$ to be $(-1)^{\ell(\mu)}$. By modifying the proof of Theorem 5.48
to include these signs, we obtain

$$\prod_{i=1}^{\infty}(1 - z^i) = \sum_{\mu \in \mathrm{DisPar}} (-1)^{\ell(\mu)} z^{|\mu|}.$$


We now define an ingenious weight-preserving, sign-reversing involution $I$ on DisPar
(due to Franklin). Given a partition $\mu = (\mu_1 > \mu_2 > \cdots > \mu_s) \in \mathrm{DisPar}$, let $a \ge 1$ be the
largest index such that the part sizes $\mu_1, \mu_2, \ldots, \mu_a$ are consecutive integers, and let $b = \mu_s$
be the smallest part of $\mu$. Figure 5.4 shows how $a$ and $b$ can be found by visual inspection
of the Ferrers diagram of $\mu$. For most partitions $\mu$, we define $I$ as follows. If $a < b$, let $I(\mu)$
be the partition obtained by decreasing the first $a$ parts of $\mu$ by 1 and adding a new part
of size $a$ to the end of $\mu$. If $a \ge b$, let $I(\mu)$ be the partition obtained by removing the last
part of $\mu$ (of size $b$) and increasing the first $b$ parts of $\mu$ by 1 each. See the examples in
Figure 5.4. The map $I$ is weight-preserving and sign-reversing, since $I(\mu)$ has one more part
or one fewer part than $\mu$. It is also routine to check that $I(I(\mu)) = \mu$. Thus we can cancel
out all the pairs of objects $\{\mu, I(\mu)\}$.

FIGURE 5.4
Franklin’s partition involution.

It may seem at first glance that we have canceled all the objects in DisPar! However,
there are some choices of $\mu$ where the definition of $I(\mu)$ in the previous paragraph fails
to produce a partition with distinct parts. Consider what happens in the overlapping case
$a = \ell(\mu)$. If $b = a + 1$ in this situation, the prescription for creating $I(\mu)$ leads to a partition
whose smallest two parts both equal $a$. On the other hand, if $b = a$, the definition of $I(\mu)$
fails because there are not enough parts left to increment by 1 after dropping the smallest
part of $\mu$. In all other cases, the definition of $I$ works even when $a = \ell(\mu)$. We see now that
there are two classes of partitions that cannot be canceled by $I$ (see Figure 5.5). First, there
are partitions of the form $(2n, 2n-1, \ldots, n+1)$, which have length $n$ and area $n(3n+1)/2$,
for all $n \ge 1$. Second, there are partitions of the form $(2n-1, 2n-2, \ldots, n)$, which have
length $n$ and area $n(3n-1)/2$, for all $n \ge 1$. Furthermore, the empty partition is not
canceled by $I$. Adding up these signed, weighted objects gives the right side of the equation
in the theorem. $\square$
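
Franklin's involution, including its exceptional cases, can be written out in a few lines. The
Python sketch below is an illustration under the assumption that a partition with distinct
parts is stored as a strictly decreasing list; the names `franklin` and `distinct_partitions`
are hypothetical. Partitions that the involution cannot cancel are returned unchanged.

```python
def franklin(mu):
    """Apply Franklin's involution I to a strictly decreasing list `mu`.
    Uncancelled partitions (the fixed points) are returned unchanged."""
    if not mu:
        return []                       # the empty partition is not cancelled
    s = len(mu)
    a = 1                               # length of the initial run of consecutive parts
    while a < s and mu[a] == mu[a - 1] - 1:
        a += 1
    b = mu[-1]                          # smallest part
    if a < b:
        if a == s and b == a + 1:       # shape (2n, 2n-1, ..., n+1): fixed point
            return list(mu)
        # Decrease the first a parts by 1 and append a new part of size a.
        return [part - 1 for part in mu[:a]] + mu[a:] + [a]
    if a == s and b == a:               # shape (2n-1, 2n-2, ..., n): fixed point
        return list(mu)
    # Remove the last part (of size b) and increase the first b parts by 1.
    rest = mu[:-1]
    return [part + 1 for part in rest[:b]] + rest[b:]


def distinct_partitions(n, largest=None):
    """Yield all partitions of n into distinct parts, as strictly decreasing lists."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield [first] + rest
```

Summing $(-1)^{\ell(\mu)}$ over `distinct_partitions(n)` gives the coefficient of $z^n$ in the theorem, and
the only partitions with `franklin(mu) == mu` are the ones described in the proof.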


We can now deduce Euler’s recursion for counting integer partitions, which we stated 
without proof in §2.11. 


5.55. Theorem: Partition Recursion. For every $n \in \mathbb{Z}$, let $p(n)$ be the number of
integer partitions of $n$. The numbers $p(n)$ satisfy the recursion

$$\begin{aligned}
p(n) &= p(n-1) + p(n-2) - p(n-5) - p(n-7) + p(n-12) + p(n-15) - \cdots \\
&= \sum_{k=1}^{\infty} (-1)^{k-1}\left[p(n - k(3k-1)/2) + p(n - k(3k+1)/2)\right]
\end{aligned} \tag{5.13}$$

for all $n \ge 1$. The initial conditions are $p(0) = 1$ and $p(n) = 0$ for all $n < 0$.
FIGURE 5.5
Fixed points of Franklin’s involution.

Proof. We have proved the identities

$$\prod_{i=1}^{\infty}\frac{1}{1 - z^i} = \sum_{\mu \in \mathrm{Par}} z^{|\mu|} = \sum_{n=0}^{\infty} p(n) z^n,$$

$$\prod_{i=1}^{\infty}(1 - z^i) = 1 + \sum_{k=1}^{\infty}(-1)^k\left[z^{k(3k-1)/2} + z^{k(3k+1)/2}\right].$$


The product of the left sides of these two identities is 1, so the product of the right sides is
also 1. Thus, for each $n \ge 1$, the coefficient of $z^n$ in the product

$$\left(\sum_{n=0}^{\infty} p(n) z^n\right) \cdot \left(1 + \sum_{k=1}^{\infty}(-1)^k\left[z^{k(3k-1)/2} + z^{k(3k+1)/2}\right]\right)$$

is zero. This coefficient also equals $p(n) - p(n-1) - p(n-2) + p(n-5) + p(n-7) - \cdots$.
Solving for $p(n)$ yields the recursion in the theorem. $\square$
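
Recursion (5.13) gives a fast way to tabulate $p(n)$, since only about $\sqrt{n}$ earlier values enter
each step. The following Python sketch fills a table from the bottom up; the function name
`partition_numbers` is an illustrative choice, and terms with a negative argument are simply
skipped because $p(n) = 0$ for $n < 0$.

```python
def partition_numbers(limit):
    """Return the list [p(0), p(1), ..., p(limit)] computed from recursion (5.13)."""
    p = [0] * (limit + 1)
    p[0] = 1                                    # initial condition p(0) = 1
    for n in range(1, limit + 1):
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1      # (-1)^(k-1)
            p[n] += sign * p[n - k * (3 * k - 1) // 2]
            second = k * (3 * k + 1) // 2
            if second <= n:                     # p of a negative argument is 0
                p[n] += sign * p[n - second]
            k += 1
    return p
```

For example, `partition_numbers(10)` ends with $p(10) = 42$.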


Summary 


• Definitions of Generating Functions. Given a set $S$ with a weight function $\mathrm{wt} : S \to \mathbb{Z}_{\ge 0}$
such that $a_n = |\{u \in S : \mathrm{wt}(u) = n\}|$ is finite for all $n \ge 0$, the ordinary generating
function (OGF) for $S$ is $\mathrm{GF}(S; z) = \sum_{n=0}^{\infty} a_n z^n$. The exponential generating function
(EGF) for $S$ is $\mathrm{EGF}(S; z) = \sum_{n=0}^{\infty} a_n z^n/n!$. Many generating functions can be viewed
analytically, as analytic functions of a complex variable $z$ specified by a power series that
converges in a neighborhood of zero. All generating functions can be viewed algebraically,
by defining the formal power series $\sum_{n=0}^{\infty} a_n z^n$ to serve as notation for the infinite sequence
$(a_n : n \ge 0)$.


• Convergence of Power Series. Given a complex sequence $(a_n)$, there exists a radius of
convergence $R \in [0, \infty]$ such that for all $z \in \mathbb{C}$ with $|z| < R$, $\sum_{n=0}^{\infty} a_n z^n$ converges
absolutely, and for all $z \in \mathbb{C}$ with $|z| > R$, this series diverges. The radius of convergence
can be computed as

$$R = \lim_{n \to \infty}\left|\frac{a_n}{a_{n+1}}\right| \quad \text{and} \quad R = \frac{1}{\lim_{n \to \infty}\sqrt[n]{|a_n|}}$$

when these limits exist in $[0, \infty]$.
• Power Series and Analytic Functions. A power series $\sum_{n=0}^{\infty} a_n z^n$ with radius of
convergence $R > 0$ defines an analytic function $G$ on the open disk $D(0; R)$, and $a_n = G^{(n)}(0)/n!$.
Conversely, any analytic function $G$ defined on $D(0; R)$ has the power series expansion
$G(z) = \sum_{n=0}^{\infty} G^{(n)}(0) z^n/n!$, valid for all $z$ in this disk.


• Examples of Power Series. Some frequently used power series include:

- Exponential Function: $e^z = \sum_{n=0}^{\infty} z^n/n!$ (valid for all $z \in \mathbb{C}$);

- Logarithm Function: $\mathrm{Log}(1 + z) = \sum_{n=1}^{\infty} (-1)^{n-1} z^n/n$ (valid for $|z| < 1$);

- Geometric Series: $(1 - bz)^{-1} = \sum_{n=0}^{\infty} b^n z^n$ (valid for $|z| < 1/|b|$);

- Negative Binomial Expansion: $(1 - bz)^{-m} = \sum_{n=0}^{\infty} \binom{n+m-1}{n,\,m-1} b^n z^n$ (where $m \in \mathbb{Z}_{>0}$
and $|z| < 1/|b|$);

- General Binomial Expansion: $(1 + z)^r = \sum_{n=0}^{\infty} (r){\downarrow}_n\, z^n/n!$ (valid for $|z| < 1$).


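• Operations on Formal Power Series. Given $F = \sum_{n=0}^{\infty} a_n z^n$ and $G = \sum_{n=0}^{\infty} b_n z^n$, $F = G$
iff $a_n = b_n$ for all $n \ge 0$. Algebraic operations on formal power series are defined as follows:

$$\begin{aligned}
F + G &= \sum_{n=0}^{\infty} (a_n + b_n) z^n, & cF &= \sum_{n=0}^{\infty} (c a_n) z^n, \\
z^k F &= \sum_{n=k}^{\infty} a_{n-k} z^n, & FG &= \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n} a_k b_{n-k}\right) z^n, \\
F' &= \sum_{n=0}^{\infty} (n+1) a_{n+1} z^n, & zF' &= \sum_{n=0}^{\infty} n a_n z^n.
\end{aligned}$$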


Similar formulas hold for analytic power series. 


• Solving Recursions via Generating Functions. When a sequence $(a_n)$ is defined by a
recursion and initial conditions, the generating function $F(z) = \sum_{n=0}^{\infty} a_n z^n$ satisfies an
algebraic equation determined by the recursion. To find this equation, subtract the terms
of $F$ coming from the initial conditions, apply the recursion, and simplify. One can often
get an explicit expression for $a_n$ by finding the partial fraction decomposition of $F$ and
using the Negative Binomial Theorem.


• Evaluating Summations via Generating Functions. Given $F(z) = \sum_{n=0}^{\infty} a_n z^n$, the
coefficient of $z^n$ in the power series expansion of $F(z)/(1 - z)$ is $\sum_{k=0}^{n} a_k$. This coefficient can
often be found using partial fraction decompositions and the Negative Binomial Theorem.


• Generating Function for Derangements. Let $d_n$ be the number of derangements of
$\{1, 2, \ldots, n\}$ (permutations with no fixed points). The EGF for these numbers is

$$\sum_{n=0}^{\infty} d_n \frac{z^n}{n!} = \frac{e^{-z}}{1 - z}.$$


• Sum Rule for Weighted Sets. If the weighted set $S$ is the disjoint union of weighted sets
$S_1, \ldots, S_k$ and $\mathrm{wt}_{S_i}(u) = \mathrm{wt}_S(u)$ for all $u \in S_i$, then

$$\mathrm{GF}(S; z) = \mathrm{GF}(S_1; z) + \mathrm{GF}(S_2; z) + \cdots + \mathrm{GF}(S_k; z).$$


• Product Rule for Weighted Sets. Given fixed $k \in \mathbb{Z}_{>0}$ and weighted sets $S, S_1, \ldots, S_k$,
suppose every $u \in S$ can be constructed in exactly one way by choosing $u_1 \in S_1$, $u_2 \in S_2$,
..., $u_k \in S_k$, and assembling these choices in some prescribed manner. If
$\mathrm{wt}(u) = \sum_{i=1}^{k} \mathrm{wt}(u_i)$ for all $u \in S$, then

$$\mathrm{GF}(S; z) = \mathrm{GF}(S_1; z) \cdot \mathrm{GF}(S_2; z) \cdot \ldots \cdot \mathrm{GF}(S_k; z).$$

• Bijection Rule for Weighted Sets. If $f : S \to T$ is a weight-preserving bijection between
two weighted sets, then $\mathrm{GF}(S; z) = \mathrm{GF}(T; z)$.

• Generating Functions for Trees. The formal power series

$$\frac{1 - \sqrt{1 - 4z}}{2} = \sum_{n=1}^{\infty} \frac{1}{2n-1}\binom{2n-1}{n-1,\,n} z^n$$

is the generating function for the following sets of weighted trees: (a) binary trees, weighted
by number of vertices plus 1; (b) nonempty full binary trees, weighted by number of leaves;
(c) ordered trees, weighted by number of vertices.


• EGF Product Rule. Given fixed $k \in \mathbb{Z}_{>0}$ and weighted sets $S, S_1, \ldots, S_k$, suppose every
$u \in S$ can be constructed in exactly one way as follows. For $1 \le j \le k$, choose $u_j \in S_j$
of weight $i_j$; choose an anagram $w \in R(1^{i_1} 2^{i_2} \cdots k^{i_k})$; and assemble the chosen objects
$u_1, \ldots, u_k, w$ in a prescribed manner. If $\mathrm{wt}(u) = i_1 + \cdots + i_k = \sum_{j=1}^{k} \mathrm{wt}(u_j)$ for all $u \in S$,
then

$$\mathrm{EGF}(S; z) = \mathrm{EGF}(S_1; z) \cdot \mathrm{EGF}(S_2; z) \cdot \ldots \cdot \mathrm{EGF}(S_k; z).$$

This rule is useful for counting words with restrictions on the number of times each letter
can be used.


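• Generating Functions for Stirling Numbers. The recursions for Stirling numbers lead to
formal differential equations for the associated generating functions. Solving these gives
the identities

$$\sum_{n=0}^{\infty}\sum_{k=0}^{n} \frac{s'(n,k)}{n!} t^k z^n = (1 - z)^{-t}; \qquad
\sum_{n=0}^{\infty}\sum_{k=0}^{n} \frac{S(n,k)}{n!} t^k z^n = e^{t(e^z - 1)}.$$

Here $s'(n,k)$ denotes a signless Stirling number of the first kind and $S(n,k)$ a Stirling number
of the second kind. Hence, $\sum_{n=k}^{\infty} S(n,k) z^n/n! = (e^z - 1)^k/k!$ for all $k \ge 0$.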


• Partition Generating Functions. Some generating function identities for integer partitions
include:

$$\sum_{\mu \in \mathrm{Par}} t^{\ell(\mu)} z^{|\mu|} = \prod_{i=1}^{\infty} \frac{1}{1 - t z^i} = \sum_{\mu \in \mathrm{Par}} t^{\mu_1} z^{|\mu|};$$

$$\sum_{\mu \in \mathrm{OddPar}} z^{|\mu|} = \prod_{k=1}^{\infty} \frac{1}{1 - z^{2k-1}} = \prod_{i=1}^{\infty} (1 + z^i) = \sum_{\mu \in \mathrm{DisPar}} z^{|\mu|}.$$

Sylvester’s bijection dissects the centered Ferrers diagram of $\mu \in \mathrm{OddPar}$ into L-shaped
pieces that give a partition in DisPar. Glaisher’s bijection replaces each part $k = 2^e c$ in
a partition $\nu \in \mathrm{DisPar}$ (where $e \ge 0$ and $c$ is odd) by $2^e$ copies of $c$, giving a partition in
OddPar.


• Pentagonal Number Theorem. Franklin proved

$$\prod_{i=1}^{\infty}(1 - z^i) = 1 + \sum_{n=1}^{\infty}(-1)^n\left[z^{n(3n-1)/2} + z^{n(3n+1)/2}\right]$$

by an involution on signed partitions with distinct parts. The map moves boxes between
the staircase at the top of the partition and the bottom row; this move cancels all partitions
except the ones counted by the right side. Since $\prod_{i \ge 1}(1 - z^i)$ is the inverse of the partition
generating function, we deduce the partition recursion

$$p(n) = \sum_{k=1}^{\infty} (-1)^{k-1}\left[p(n - k(3k-1)/2) + p(n - k(3k+1)/2)\right].$$
Exercises


5-1. Let S = {nitwit, blubber, oddment, tweak}. (a) Find GF(S; z) if the weight of a word 
is its length. (b) Find GF(S; z) if the weight of a word is the number of distinct letters in 
it. (c) Find GF(S; z) if the weight of a word is the number of distinct vowels in it. 

5-2. Fix q € Zso, let S be the set of all words using the alphabet {1,2,...,q}, and let wt(u) 
be the length of u for u € S. (a) Find GF(S;z). (b) Thinking of GF(S; z) as an analytic 
power series, evaluate the series and state the radius of convergence. 

5-3. Let S = UP {1,2,...,n}”, and let wt(u) be the length of u for u € S. (a) Find 
F(z) = GF(S; z). What is the radius of convergence of F'? (b) Viewing F’ as a formal power 
series, find F’(z). 

5-4. Fix m > 0, let T be the set of nonempty subsets of {1,2,...,m}, and let wt(B) be the 
least element of B for B € T. Find GF(T; z) and then simplify the resulting series. 

5-5. Let S be the set of five-card poker hands. Define the weight of a hand to be the number 
of aces in the hand. Find GF(S; z). 

5-6. For fixed k € Zso, describe a weighted set S of combinatorial objects such that 
GH Si 2)= 5 at's” 

5-7. For fixed $m \in \mathbb{Z}_{>0}$, describe a weighted set $S$ of combinatorial objects such that
$\mathrm{GF}(S; z) = (1 - z)^{-m}$.

5-8. Let S be the set of all finite sequences of 0’s and 1’s, where the weight of a sequence is 
its length. Find EGF(S; z), evaluate the resulting series, and state the radius of convergence. 
5-9. (a) Let S = Zso, and let wt(n) = n for n € S. Find GF(S; z) and EGF(; z). For each 
series, simplify fully and state the radius of convergence. (b) Repeat (a) using S = Zso. 
(c) Repeat (a) using wt(n) = 3n. 

5-10. Suppose S is a weighted set with weight function wt : S — Zso and generating 
function G(z) = GF(S;z). How does the generating function change if we use the new 
weight wt’(wu) = a-wt(u) + b, where a,b € Zso are fixed? 

5-11. Find a weighted set S' such that EGF(S; z) has radius of convergence equal to zero. 


5-12. Use the Ratio Test or Root Test to determine whether each series converges. 
een 
din—0(3n)"/n” 
lee) 


1) Eco (1+ 1/n)?”" 


5-13. Find the radius of convergence of each complex power series.


AaeT 


OU 
Dees hb ~ 
Ba 
3 
w 
R 
3 


Bees s 


din—o(2/n)” 

(e) Vn=o 2°"/(8n)! 

5-14. Let $a_n = \prod_{i=1}^{n} (2i-1)/(2i)$ for $n \ge 1$. Find the radius of convergence of $\sum_{n=1}^{\infty} a_n z^n$.
5-15. Find the radius of convergence of J(z) = 7729 eee 

5-16. For fixed k € Zyo and b € Cyo, find the radius of convergence of each complex power 
series. 

(a) 57g toe" 

(b) So, yn’ bP 2" /nl 
(cl > 9 WO a nl

(d) Vento (n!)*2"/(kn)! 

5-17. In Example 5.10, verify the power series representations for sin z, cos z, sinh z, and 
cosh z by taking derivatives. 

5-18. (a) Use power series to prove $e^{iz} = \cos z + i \sin z$ for all $z \in \mathbb{C}$. (b) Use power series
to prove $\cos z = (e^{iz} + e^{-iz})/2$ for all $z \in \mathbb{C}$. (c) Find and prove a formula for $\sin z$ similar
to the one in (b).


5-19. Define formal versions of $e^z$, $\sin z$, $\cos z$, $\sinh z$, $\cosh z$, and $\mathrm{Log}(1 + z)$ by viewing the
series expansions in §5.3 as formal power series. Compute the formal derivatives of these 
six formal power series. 


5-20. Use the Extended Binomial Theorem to write the first seven terms of each power 
series: (a) (1 — z)—*; (b) (1+ 2)*/*: (c) (1 —82)1/?; (d) A+ 22)?4. 

5-21. Expand each function as a power series and determine the radius of convergence: 
(a) 2(1—32)~7; (b) VI+z; () 1/VI—2 (A) (1+ 52)"°. 

5-22. Prove $(1 + z)^{-1/2} = \sum_{n=0}^{\infty} \binom{2n}{n} (-z/4)^n$ for $|z| < 1$.

5-23. Find the coefficient of z!° in each power series: (a) (1—z)~3; (b) (1+2z)~°; (c) (1-27); 
(d) 1/(1 — 7z + 1027). 

5-24. Suppose $F$ and $G$ are analytic on $D(0; R)$, and define $P(z) = F(z)G(z)$ for
$z \in D(0; R)$. Use the Product Rule for Derivatives and induction to prove that for all $n \in \mathbb{Z}_{\ge 0}$,

$$P^{(n)}(z) = \sum_{k=0}^{n} \binom{n}{k} F^{(k)}(z)\, G^{(n-k)}(z).$$


5-25. Prove parts (a), (c), (d), and (g) of Theorem 5.14. 

5-26. Suppose $F(z) = \sum_{n=0}^{\infty} a_n z^n$ for $z \in D(0; R)$. Find power series representations for
all antiderivatives of $F$, which are functions $G : D(0; R) \to \mathbb{C}$ satisfying $G' = F$.

5-27. (a) Given F(z) = 7°, an2", what is the coefficient of z” in F(z)? (b) Let zD 
denote the operation of taking a derivative with respect to z and then multiplying by z. 
What is the coefficient of z” in (zD)*(F)? 


5-28. (a) What is the sequence of coefficients corresponding to the formal power series $z^k$?
(b) Verify that the product of the formal power series $z^k$ and $z^m$ is $z^{k+m}$. (c) Verify that
parts (c) and (d) of Definition 5.15 are special cases of part (e) of that definition.


5-29. Suppose $F$ and $G$ are formal power series, and $c$ is a scalar. Prove the following
rules for formal derivatives. (a) The Sum Rule: $(F + G)' = F' + G'$. (b) The Scalar Rule:
$(cF)' = c(F')$. (c) The Product Rule: $(F \cdot G)' = F' \cdot G + F \cdot G'$. (d) The Power Rule: For
$n \in \mathbb{Z}_{>0}$, $(F^n)' = nF^{n-1} \cdot F'$.

5-30. Consider the formal power series $F(z) = 1 - bz$ and $G(z) = \sum_{n=0}^{\infty} b^n z^n$. Use the
definition of multiplication of formal power series to confirm that $F(z)G(z) = 1$. This
justifies the notation $G(z) = 1/(1 - bz)$.

5-31. Solve each recursion using generating functions. (a) $a_n = 2a_{n-1}$ for $n \ge 1$ and $a_0 = 1$.
(b) $a_n = 2a_{n-1}$ for $n \ge 1$ and $a_0 = 3$. (c) $a_n = 2a_{n-1} + 1$ for $n \ge 1$ and $a_0 = 0$.

5-32. (a) Given $a_n = 3a_{n-1}$ for $n \ge 1$ and $a_0 = 2$, solve for $a_n$. (b) Given $a_n = 3a_{n-1} + 3n$
for $n \ge 1$ and $a_0 = 2$, solve for $a_n$. (c) Given $a_n = 3a_{n-1} + 3^n$ for $n \ge 1$ and $a_0 = 2$, solve
for $a_n$.

5-33. Solve the recursion $a_n = 6a_{n-1} - 8a_{n-2} + g(n)$ for $n \ge 2$ with initial conditions
$a_0 = 0$, $a_1 = 2$ for the following choices of $g(n)$: (a) $g(n) = 0$; (b) $g(n) = 1$; (c) $g(n) = 2^n$;
(d) g(n) = na”. 
5-34. Define $a_0 = 1$, $a_1 = 2$, and $a_n = a_{n-1} + 6a_{n-2}$ for $n \ge 2$. Use generating functions to
solve for $a_n$.

5-35. Define $a_0 = 3$, $a_1 = 0$, and $a_n = 2a_{n-1} - a_{n-2} - n$ for $n \ge 2$. Use generating functions
to solve for $a_n$.

5-36. Define $a_0 = 0$, $a_1 = 1$, and $a_n = 5a_{n-1} - 6a_{n-2} + 2^n$ for $n \ge 2$. Use generating
functions to solve for $a_n$.

5-37. Define $a_0 = b$, $a_1 = c$, and $a_n = -6a_{n-1} - 5a_{n-2} + (-1)^n$ for $n \ge 2$. Use generating
functions to solve for $a_n$.

5-38. Define Fibonacci numbers by $f_0 = 0$, $f_1 = 1$, and $f_n = f_{n-1} + f_{n-2}$ for all $n \ge 2$. Use
generating functions to derive a closed formula for $f_n$.

5-39. Define Lucas numbers by setting $L_0 = 1$, $L_1 = 3$, and $L_n = L_{n-1} + L_{n-2}$ for all
$n \ge 2$. Use generating functions to find a closed formula for $L_n$.

5-40. Solve the recursion $a_n = -3a_{n-1} + 2a_{n-2} + 6a_{n-3} - a_{n-4} - 3a_{n-5}$ for $n \ge 5$ with
initial conditions $a_k = k$ for $0 \le k < 5$.

5-41. Repeat the previous exercise with initial conditions $a_k = 3$ for $0 \le k < 5$.

5-42. Use generating functions to solve the recursion $a_n = ba_{n-1} + c$ and initial condition
$a_0 = d$, where $c \ne 0$ and $b \ne 1$.

5-43. Use generating functions to solve the recursion $a_n = ba_{n-1} + nb^n$ and initial condition
$a_0 = c$, where $b \ne 0$.

5-44. Use generating functions to prove Theorem 2.66. 


5-45. Use generating functions to compute the following sums. (a) 779 k?; (b) peg k5*: 
(c) De=o (3°): 

5-46. For fixed r # 1, compute >>) _4(k + 1)r* with generating functions. 

5-47. Evaluate )7y\_, k’. 

5-48. Prove > 
5-49. Let Fi ia, Bak. Find the generating function for each sequence (s,,). 

(a) $n = dk=0 ax(—1)"~ 

(b) sp) = dok=0 ar/(n—k)! 

(c) 8) = Sot_g(n — k)Pa 

(d) 8) = ere ae Aj; Ak 

5-50. Find )7y_4(—1)*k?. 

5-51. Use generating functions to find )\7_ 3*/(k!(n — k)!). 

5-52. Suppose $b_0 = 1$ and $b_n = b_0 + b_1 + \cdots + b_{n-1} + 1$ for all $n \ge 1$. Find $\sum_{n=0}^{\infty} b_n z^n$.
5-53. Suppose $(c_n : n \in \mathbb{Z})$ satisfies $c_0 = 0$, $c_1 = 1$, and $c_n = (c_{n-1} + c_{n+1})/L$ for all $n \in \mathbb{Z}$,
where $L \in \mathbb{R}_{>0}$ is a constant. Find an explicit formula for $c_n$.


5-54. Sums of Powers of Integers. Recall the notation (a){n= a(a+1)(a+2)---(at+tn—1). 
Prove: for all k,n € Zyo, 


it jon C2iC2j = 4"Cn, where C, denotes a Catalan number. 


k : 
L S(k+1,j+1 ios 
IF + ak 4..-4+nk=)5 ORE y(n tye 


5-55. Suppose f : S — T is a bijection such that for some k € Zyo and all u € S, 
wtr(f(u)) = wts(u) + k. How are GF(S; z) and GF(T; z) related? 
5-56. Fix a positive integer $m$, and let $S$ be the set of all subsets of $\{1, 2, \ldots, m\}$. (a) Define
$\mathrm{wt}(B) = |B|$ for $B \in S$. Use the Product Rule for Weighted Sets to show that $\mathrm{GF}(S; z) =
(1 + z)^m$ (cf. Example 5.3). (b) Now define $\mathrm{wt}(B) = \sum_{i \in B} i$ for $B \in S$. Find $\mathrm{GF}(S; z)$.
5-57. (a) Let S be the set of m-letter words using the alphabet {0,1,...,k — 1}. Let the 
weight of a word be the sum of all letters in the word. Find GF(S; z). (b) Repeat (a) using 
the alphabet {1,2,...,k}. 

5-58. Find a formula for Dear Gnz", where a, is the number of positive integer solutions 
to #4, +%2+2%3+2%4+25 =n such that 71, 73, and x5 are odd, rg > 4, and x3 < 9. 

5-59. An urn contains r red balls, b blue balls, and w white balls. Let S be the set of all 
multisets of balls that can be drawn from this urn, weighted by the number of balls in the 
multiset. Find GF(S; z). 

5-60. Suppose we distribute 30 identical lollipops to 10 children so that each child receives 
between 2 and 4 lollipops. How many ways can this be done? 

5-61. A candy store has 6 chocolate truffles, 8 coconut creams, 5 caramel nougats, and an 
unlimited supply of cashew clusters. (a) How many ways can a box of ten pieces of candy 
be chosen from this inventory? (b) Repeat (a) assuming there must be at least one of each 
type of candy. 

5-62. Let $a_n$ be the number of ways to pay someone $n$ cents using pennies, nickels, dimes,
and quarters. Find a formula for $\sum_{n=0}^{\infty} a_n z^n$.

5-63. Suppose we have three types of stamps that cost $r$ cents, $s$ cents, and $t$ cents
(respectively). Find the generating function for $a_n$, the number of ways to pay $n$ cents postage
using a multiset of these stamps.

5-64. State and prove versions of the Sum Rule and Bijection Rule for EGFs. 


5-65. Find an EGF for words in the alphabet {1,2,...,4} where every symbol in the 
alphabet is used at least twice. 

5-66. Find an EGF for words in the alphabet {1,2,...,4} where every symbol is used 
between three and eight times. 

5-67. Find an EGF for words in the alphabet {a, b,...,z} where each vowel can be used at 
most twice. 

5-68. How many n-digit strings using the symbols {0,1,2,3} have an odd number of 0’s 
and an odd number of 1’s? 

5-69. Count the n-digit strings using the symbols {0,1, 2,3} in which the total number of 
0’s and 1’s is odd. 

5-70. An ordered set partition of $X$ is a list $(B_1, \ldots, B_k)$ of distinct sets such that
$\{B_1, \ldots, B_k\}$ is a set partition of $X$. Fix $k > 0$, and let $S$ be the set of ordered set partitions
of $\{1, 2, \ldots, n\}$ (for varying $n \ge k$) into exactly $k$ blocks where every block has odd size.
Find the EGF for $S$.

5-71. Repeat the previous exercise assuming the size of each block is even (and positive). 
5-72. Let S be the set of k-element subsets of Z>0, weighted by the largest element in S. 
(a) Find GF(S;z). (b) What happens if we weight a subset by its smallest element? 

5-73. Fix k © Zo. Use the Sum and Product Rules for Weighted Sets to find the generating 
function for the set of all compositions with k parts, weighted by the sum of the parts. 
5-74. Compute the images of these partitions under Sylvester’s Bijection 5.50: 

(a) (15, 52,37, 19); (b) (75,39, 1); (c) (11,7,5,3); (a) (9%); (e) (2n — 1,2n —3,...,5,3)1). 
(The notation $a^b$ denotes $b$ parts equal to $a$.)


5-75. Compute the images of these partitions under the inverse of Sylvester’s Bijection. 
(a) (15, 12, 10, 8,6, 3, 1) 
(b) (28, 12, 7, 6, 2, 1)

(c) (11,7, 5,3) 

CEE TG 8320 
(e) (n,n —1,...,3,2,1) 


5-76. Compute the images of these partitions under Glaisher’s Bijection 5.51:
(a) (9, 8, 5, 3, 1); (b) (28, 12, 7, 6, 2, 1); (c) (11, 7, 5, 3); (d) (21, 17, 16, 13, 11, 9, 8, 3, 2, 1).
5-77. Compute the images of these partitions under the inverse of Glaisher’s Bijection: 
(a) (15,57,37, 1"); (b) (i3*, 11°, 77,5, 3°, 1°) (c) (11,.7,5,3); Gd) (9*); (6) (1). 
5-78. Which partitions map to themselves under Glaisher’s Bijection? Which partitions 
map to themselves under the generalized bijection in Theorem 5.52? 
5-79. Let H and K be the bijections in the proof of Theorem 5.52. (a) Find 
H((25,.17, 17,10, 9,6, 6, 5,.2,2)) for d= 3,4, 5, (b) Find K((8"", 77, 2°°, 1°°)) for d = 3,5, 6. 
5-80. Calculate the image of each partition under Franklin’s Involution (§5.17): 
(a) (17,16, 15, 14,13, 10,8, 7,4); (b) (17, 16, 15, 14, 13, 10,8); (c) (n,n —1,...,3,2,1); 
(d) (n). 
5-81. Find the generating function for the set of all integer partitions that satisfy each 
restriction below (weighted by area): (a) all parts are divisible by 3; (b) all parts are distinct 
and even; (c) odd parts appear at most twice; (d) each part is congruent to 1 or 4 mod 7; 
(e) for each $i > 0$, there are at most $i$ parts of size $i$.


5-82. Give combinatorial interpretations for the coefficients in the following formal power
series: (a) [],5,(1— 2°}; (b) FI jso(1 + 20*1)(1+ 2); (¢) TIisa(1— 2 + 24)/(1— 2%). 

5-83. (a) Show that the first Rogers-Ramanujan Identity (see Remark 5.53) can be written

$$\prod_{n=1}^{\infty} \frac{1}{(1 - z^{5n-4})(1 - z^{5n-1})} = 1 + \sum_{n=1}^{\infty} \frac{z^{n^2}}{(1 - z)(1 - z^2) \cdots (1 - z^n)}.$$

(b) Find a similar formulation of the second Rogers-Ramanujan Identity. (c) Verify the
Rogers-Ramanujan Identities for partitions of $n = 12$ by explicitly listing all the partitions
satisfying the relevant restrictions.


5-84. Prove by induction that a nonempty full binary tree with a leaves has a — 1 non-leaf 
vertices. 


5-85. Let f be the bijection in Figure 5.1. Compute f(T), where T is the binary tree in 
Figure 2.11. 


5-86. Let g be the bijection shown in Figure 5.2. Verify that the number of vertices in g(t) 
equals the number of leaves in t, for each full binary tree t. 


5-87. (a) Describe the inverse of the bijection g shown in Figure 5.2. (b) Calculate the 
image under g~! of the ordered tree shown here. 


5-88. List all full binary trees with five leaves, and compute the image of each tree under 
the map g in Figure 5.2. 


5-89. Give an algebraic proof of Theorem 5.52 using formal power series. 


5-90. (a) Carefully verify that the maps H and K in 5.51 are two-sided inverses. (b) Repeat 
part (a) for the maps H and K in Theorem 5.52. 


5-91. (a) Verify that the partition $(2n, 2n-1, \ldots, n+1)$ (one of the fixed points of Franklin’s
Involution) has area $n(3n+1)/2$. (b) Verify that the partition $(2n-1, 2n-2, \ldots, n)$ has
area $n(3n-1)/2$.


5-92. (a) Find the generating function for the set of all Dyck paths, where the weight of a 
path ending at (n,n) is n. (b) A marked Dyck path is a Dyck path in which one step (north 
or east) has been circled. Find the generating function for marked Dyck paths. 


5-93. Recall that $\sum_{n=0}^{\infty}\sum_{k=0}^{n} \frac{S(n,k)}{n!} t^k z^n = e^{t(e^z - 1)}$. Use partial differentiation of this
generating function to find generating functions for: (a) the set of set partitions where one
block in the partition has been circled; (b) the set of set partitions where one element of
one block in the partition has been circled.


5-94. Let S be the set of paths that start at (0,0) and take horizontal steps (right 1, up 
0), vertical steps (right 0, up 1), and diagonal steps (right 1, up 1). By considering the final 
step of a path, find an equation satisfied by G(z) = GF(S;z) and solve for G(z), taking the 
weight of a path ending at $(c, d)$ to be: (a) the number of steps in the path; (b) $c + d$; (c) $c$.
5-95. For fixed $k \ge 1$, find the generating function for integer partitions with: (a) $k$ nonzero
parts; (b) $k$ nonzero distinct parts. (c) Deduce summation formulas for the infinite products
$\prod_{i=1}^{\infty}(1 - z^i)^{-1}$ and $\prod_{i=1}^{\infty}(1 + z^i)$.

5-96. A ternary tree is either $\emptyset$ or a 4-tuple $(\bullet, t_1, t_2, t_3)$, where each $t_i$ is a ternary tree. 
Find an equation satisfied by the generating function for the set of ternary trees, weighted 
by number of vertices. 


5-97. Let S be the set of ordered trees where every vertex has at most two children, weighted 
by the number of vertices. (a) Use the Sum and Product Rules to find an equation satisfied 
by GF(S;z). (b) Solve this equation for GF(S;z). (c) How many trees in S have seven 
vertices? 


5-98. Prove that the number of integer partitions of n in which no even part appears more 
than once equals the number of partitions of n in which no part appears four or more times. 


5-99. Prove that the number of integer partitions of n that have no part equal to 1 and no 
parts that differ by 1 equals the number of partitions of n in which no part appears exactly 
once. 


5-100. (a) Find an infinite product that is the generating function for integer partitions 
with odd, distinct parts. (b) Show that the generating function for self-conjugate partitions 
(i.e., partitions such that $\lambda' = \lambda$) is $1 + \sum_{k=1}^{\infty} z^{k^2}/((1 - z^2)(1 - z^4) \cdots (1 - z^{2k}))$. (c) Find 
an area-preserving bijection between the sets of partitions in (a) and (b), and deduce an 
associated formal power series identity. 


5-101. Evaluate $\sum_{k=0}^{\infty} z^k (1 - z)^k$. 

5-102. How many integer partitions of n have the form $((i+1)^k, i^j)$ for some $i, j, k > 0$? 

5-103. Dobinski's Formula. Prove that the Bell numbers (see Definition 2.44) satisfy 
$B(n) = e^{-1} \sum_{k=0}^{\infty} (k^n/k!)$ for $n \ge 0$. 

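5-104. (a) Use an involution on the set Par × DisPar to give a combinatorial proof of the 
identity $\prod_{n=1}^{\infty} \frac{1}{1 - z^n} \prod_{n=1}^{\infty} (1 - z^n) = 1$. (b) More generally, for $S \subseteq \mathbb{Z}_{>0}$, prove 
combinatorially that $\prod_{n \in S} \frac{1}{1 - z^n} \prod_{n \in S} (1 - z^n) = 1$. 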


5-105. Let $d(n,k)$ be the number of derangements in $S_n$ with k cycles. Find a formula for 
$\sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{d(n,k)}{n!}\, t^k z^n$. 


5-106. Let $S_1(n,k)$ be the number of set partitions of $\{1, 2, \ldots, n\}$ into k blocks where no 
block consists of a single element. Find a formula for $\sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{S_1(n,k)}{n!}\, t^k z^n$. 


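5-107. Let Par(n) be the set of integer partitions of n. Show that, for all $n \ge 0$, 

$$(-1)^n\, |\mathrm{OddPar} \cap \mathrm{DisPar} \cap \mathrm{Par}(n)| = |\{\mu \in \mathrm{Par}(n) : \ell(\mu) \text{ is even}\}| - |\{\mu \in \mathrm{Par}(n) : \ell(\mu) \text{ is odd}\}|.$$
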
5-108. Involution Proof of Euler's Partition Recursion. (a) For fixed $n \ge 1$, prove 
$\sum_{j \in \mathbb{Z}} (-1)^j p(n - (3j^2 + j)/2) = 0$ by verifying that the following map I is a sign-reversing 
involution with no fixed points. The domain of I is the set of pairs $(j, \lambda)$ with $j \in \mathbb{Z}$ and 
$\lambda \in \mathrm{Par}(n - (3j^2 + j)/2)$. To define $I(j, \lambda)$, consider two cases. If $\ell(\lambda) + 3j \ge \lambda_1$, set 
$I(j, \lambda) = (j - 1, \mu)$ where $\mu$ is formed by preceding the first part of $\lambda$ by $\ell(\lambda) + 3j$ and then 
decrementing all parts by 1. If $\ell(\lambda) + 3j < \lambda_1$, set $I(j, \lambda) = (j + 1, \nu)$ where $\nu$ is formed 
by deleting the first part of $\lambda$, incrementing the remaining nonzero parts of $\lambda$ by 1, and 
appending an additional $\lambda_1 - 3j - \ell(\lambda) - 1$ parts equal to 1. (b) Given $n = 21$, $j = 1$, 
$\lambda = (5, 5, 4, 3, 2)$, compute $I(j, \lambda)$ and verify that $I(I(j, \lambda)) = (j, \lambda)$. 

5-109. We say that an integer partition $\lambda$ extends a partition $\mu$ iff for all k, k occurs 
in $\lambda$ at least as often as k occurs in $\mu$; otherwise, $\lambda$ avoids $\mu$. Suppose $\{\mu^i : i \ge 1\}$ 
and $\{\nu^i : i \ge 1\}$ are two sequences of distinct, nonzero partitions such that for all finite 
SC Zso, Dies |u| = Vies ||. (a) Prove that for every n, the number of partitions of n 
that avoid every $\mu^i$ equals the number of partitions of n that avoid every $\nu^i$. (b) Show how 
Theorem 5.52 can be deduced from (a). (c) Use (a) to prove that the number of partitions 
of n into parts congruent to 1 or 5 mod 6 equals the number of partitions of n into distinct 
parts not divisible by 3. 

5-110. Prove that absolute convergence of a complex series implies convergence of that 
series. 

5-111. Prove that if $\sum_{n=0}^{\infty} c_n$ exists in $\mathbb{C}$, then $\lim_{n \to \infty} c_n = 0$. Is the converse true? 

5-112. Comparison Test for Nonnegative Series. Suppose $a_n, b_n \in \mathbb{R}_{\ge 0}$ satisfy $a_n \le b_n$ 
for all $n \ge n_0$. Prove: if $\sum_{n=0}^{\infty} b_n$ converges, then $\sum_{n=0}^{\infty} a_n$ converges. [Hint: Use the 
Monotone Convergence Theorem, which states that a weakly increasing sequence of real 
numbers that is bounded above converges to a real limit.] 

5-113. Prove the Ratio Test. [Hint: If L < 1, compare $|c_n|$ to an appropriate geometric 
series.] 

5-114. Prove the Root Test. [Hint: If L < 1, compare $|c_n|$ to an appropriate geometric 
series.] 

5-115. (a) Show that $\sum_{n=0}^{\infty} z^{3n}/(3n)! = (1/3)e^z + (2/3)\cos(z\sqrt{3}/2)\,e^{-z/2}$. (b) Find similar 
formulas for $\sum_{n=0}^{\infty} z^{3n+1}/(3n+1)!$ and $\sum_{n=0}^{\infty} z^{3n+2}/(3n+2)!$. 

5-116. Use complex power series to prove: for all $z, w \in \mathbb{C}$, $e^{z+w} = e^z e^w$. [Hint: Hold $w \in \mathbb{C}$ 
fixed and view both sides of the identity as analytic functions of z. Show that the power 
series expansions of these two functions have the same coefficient of $z^n$ for every n.] 

5-117. For $r \in \mathbb{C}$, let $G_r(z) = (1+z)^r = e^{r \operatorname{Log}(1+z)}$ for $z \in D(0;1)$. (a) Prove by induction 
on $n \in \mathbb{Z}_{\ge 0}$ that $G_n(z)$ equals the product of n copies of $1 + z$. (b) Prove that for all 
$n \in \mathbb{Z}_{>0}$, $G_{-n}(z)$ equals the product of n copies of $1/(1+z)$. 

5-118. Define $G_r(z)$ as in the previous exercise. (a) Prove: for all $r, s \in \mathbb{C}$ and $z \in D(0;1)$, 
$G_{r+s}(z) = G_r(z) G_s(z)$. (b) Prove $G_r'(z) = r(1+z)^{r-1}$. (c) Prove $G_r^{(n)}(z) = (r){\downarrow}_n (1+z)^{r-n}$ 
for $n \in \mathbb{Z}_{\ge 0}$. 

5-119. For r in any commutative ring R containing $\mathbb{Q}$, define a formal power series version 
of $(1+z)^r$ by setting 

$$(1+z)^r = \sum_{n=0}^{\infty} \frac{(r){\downarrow}_n}{n!}\, z^n.$$

(a) Verify that the formal derivative of $(1+z)^r$ is $r(1+z)^{r-1}$. (b) Compute the higher 
formal derivatives of $(1+z)^r$. 


5-120. Prove the formal power series identity $(1+z)^{r+s} = (1+z)^r (1+z)^s$. [Hint: First find 
a combinatorial proof of the identity $(x+y){\downarrow}_n = \sum_{k=0}^{n} \binom{n}{k} (x){\downarrow}_k\, (y){\downarrow}_{n-k}$, where $n \in \mathbb{Z}_{\ge 0}$ 
and x, y are formal variables.] 

5-121. (a) Prove: for all $r \in \mathbb{Z}_{\ge 0}$, the formal power series $(1+z)^r$ equals the formal product 
of r copies of $(1+z)$. (b) Prove: for all $r \in \mathbb{Z}_{>0}$, the formal power series $(1+z)^{-r}$ equals 
the formal product of r copies of $1/(1+z) = \sum_{n=0}^{\infty} (-1)^n z^n$. 

5-122. For any commutative ring R, let $R[[z]]$ denote the set of formal power series with 
coefficients in R. Show that $R[[z]]$ is a commutative ring using the operations in 
Definition 5.15. 

5-123. Let F and G be formal power series. Prove that the formal derivative operation 
satisfies the Sum Rule $(F + G)' = F' + G'$ and the Product Rule $(F \cdot G)' = F' \cdot G + F \cdot G'$. 

5-124. Given a formal power series $F = (a_n : n \ge 0)$, let $F(0) = a_0$ be the constant 
coefficient of F. Prove: for all $n \ge 0$, $n!\, a_n = F^{(n)}(0)$. 

5-125. Recursion for Divide-and-Conquer Algorithms. Many algorithms use a 
divide-and-conquer approach in which a problem of size n is divided into a subproblems of 
size n/b, and the solutions to these subproblems are then combined in time $cn^k$ to give the 
solution to the original problem. Letting T(n) be the time needed to solve a problem of size 
n, T(n) satisfies the recursion $T(n) = aT(n/b) + cn^k$ and initial condition $T(1) = d$ (where 
$a, b, c, d > 0$ and $k \ge 0$ are given constants). Assume for simplicity that n ranges over powers 
of b. (a) Find a recursion and initial condition satisfied by $S(m) = T(b^m)$, where m ranges 
over $\mathbb{Z}_{\ge 0}$. (b) Use generating functions to solve the recursion in (a). Deduce that, for some 
constant C and large enough n, 

$$T(n) \le \begin{cases} Cn^k & \text{if } a < b^k \text{ (combining time dominates);} \\ Cn^k \log_b n & \text{if } a = b^k \text{ (dividing and combining times balance);} \\ Cn^{\log_b a} & \text{if } a > b^k \text{ (time to solve subproblems dominates).} \end{cases}$$

5-126. Merge Sort. Suppose we need to sort a given sequence of integers $x_1, \ldots, x_n$ 
into increasing order. Consider the following recursive method: if n = 1, the sequence is 
already sorted. For n > 1, divide the list into two halves, sort each half recursively, and 
merge the resulting sorted lists. Let T(n) be the time needed to sort n objects using this 
algorithm. Find a recursion satisfied by T(n), and use the previous exercise to show that 
$T(n) \le Cn \log_2 n$ for some constant C. (You may assume n ranges over powers of 2.) 

5-127. Fast Binary Multiplication. (a) Given $x = ak + b$ and $y = ck + d$ (where 
$a, b, c, d, k \in \mathbb{Z}_{\ge 0}$), verify that $xy = (ak + b)(ck + d) = ack^2 + bd + ((a+b)(c+d) - ac - bd)k$. 
Take $k = 2^n$ in this identity to show that one can multiply two 2n-bit numbers by recursively 
computing three products of n-bit numbers and doing several binary additions. (b) Find a 
recursion describing the number of bit operations needed to multiply two n-bit numbers by 
the recursive method suggested in (a). (c) Solve the recursion in (b) to determine the time 
complexity of this recursive algorithm (you may assume n is a power of 2). 
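
The identity in part (a) of Exercise 5-127 can be tried out directly. The following Python sketch is only an illustration under stated assumptions: the function name `karatsuba`, the base-case cutoff of 16, and the choice of the split point n are ours, not the text's. The technique itself is usually called Karatsuba multiplication.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply nonnegative integers using three recursive products per level."""
    if x < 16 or y < 16:
        # Small inputs: fall back on built-in multiplication (cutoff is arbitrary).
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    k_mask = (1 << n) - 1            # k = 2^n, so "mod k" is a bit mask
    a, b = x >> n, x & k_mask        # x = a*k + b
    c, d = y >> n, y & k_mask        # y = c*k + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    cross = karatsuba(a + b, c + d) - ac - bd   # equals a*d + b*c
    # xy = ac*k^2 + (ad + bc)*k + bd, with shifts playing the role of powers of k.
    return (ac << (2 * n)) + (cross << n) + bd
```

Only three recursive calls appear in the body; the remaining work consists of shifts, additions, and subtractions, which is the point of the identity.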




Notes 


For more applications of generating functions to combinatorial problems, see [9, 121, 132]. 
Two older references are [85, 107]. For an introduction to the vast subject of partition 
identities, the reader may consult [5, 97]. Sylvester’s bijection appears in [123], Glaisher’s 
bijection in [47], and Franklin’s involution in [39]. The Rogers-Ramanujan identities are 
discussed in [101, 112]; Garsia and Milne gave the first bijective proof of these identities [43, 
44]. 
Ranking, Unranking, and Successor Algorithms 


In computer science, one often needs algorithms to loop through a set of combinatorial 
objects, generate a random object from a set, or store information about such objects in 
a linear array. This chapter studies techniques for creating such algorithms. We begin by 
discussing ranking and unranking algorithms, which implement explicit bijections mapping 
combinatorial objects to integers and vice versa. Later we examine successor algorithms, 
which traverse all the objects in a given set in a particular order by going from the current 
object to the next object in a systematic way. We develop automatic methods for translating 
counting arguments based on the Sum Rule and Product Rule into ranking algorithms 
and successor algorithms. Combinatorial recursions derived using these rules can be treated 
similarly, leading to recursively defined ranking algorithms and successor algorithms. Before 
beginning, we introduce the following notation that will be used constantly. 


6.1. Definition: The Set [n]. For each positive integer n, let [n] denote the n-element 
set $\{0, 1, \ldots, n-1\}$. 




6.1 Introduction to Ranking and Successor Algorithms 


This section introduces the general notions of ranking, unranking, and successor algorithms. 
Suppose S is a finite set consisting of n objects. A ranking algorithm for S is a specific 
procedure implementing a particular bijection $\mathrm{rk} : S \to [n]$. This procedure takes an object 
$x \in S$ as input, performs some computation on x, and outputs an integer rk(x) called the 
rank of x. An unranking algorithm for S is a specific procedure implementing a bijection 
$\mathrm{unrk} : [n] \to S$. Here the input is an integer j between 0 and $n-1$, and the output unrk(j) 
is the object in S whose rank is j. There can be many different ranking and unranking 
algorithms for a given set S, since there are many bijections between S and the set [n]. 

We can use ranking and unranking algorithms to solve the problems mentioned in the 
chapter opening, as follows. To loop through all objects in the n-element set S, loop through 
all integers j in the range 0 to $n-1$. For each such j, use an unranking algorithm to compute 
the object $\mathrm{unrk}(j) \in S$, and perform whatever additional processing is needed for this object. 
To select a random object from S, first select a random integer $j \in [n]$ (using any standard 
algorithm for generating pseudorandom numbers), then return the object unrk(j). To store 
information about the objects in S in an n-element linear array, use a ranking algorithm 
to find the position rk(x) in the array where information about the object $x \in S$ should be 
stored. 
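
The three uses just described can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the sample set (the 2-element subsets of {0,1,2,3} in a fixed listed order), the names `rank` and `unrank`, and the stored data (the sum of each pair). Ranking is done by a lookup table, which is only sensible for very small sets.

```python
import random

# Hypothetical example set S, listed in a fixed order; position in the list is the rank.
OBJECTS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
RANK_TABLE = {obj: j for j, obj in enumerate(OBJECTS)}
n = len(OBJECTS)

def rank(x):
    """rk : S -> [n]."""
    return RANK_TABLE[x]

def unrank(j):
    """unrk : [n] -> S, the inverse of rank."""
    return OBJECTS[j]

# 1. Loop through all objects by unranking 0, 1, ..., n-1.
all_objects = [unrank(j) for j in range(n)]

# 2. Select a random object by unranking a random integer in [n].
random_object = unrank(random.randrange(n))

# 3. Store information about each object x at position rank(x) of a linear array.
table = [None] * n
for x in all_objects:
    table[rank(x)] = sum(x)
```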

Successor algorithms provide another, potentially more efficient approach to the task 
of looping through all the objects in S. We begin with a particular total ordering of the 
n objects in S, say $x_0 < x_1 < \cdots < x_{n-1}$. A successor algorithm for S (relative to this 
ordering) consists of three subroutines called first, last, and next. The first subroutine 
returns the object $x_0$. The last subroutine returns the object $x_{n-1}$. The next subroutine 
takes as input an object $x_i \in S$ where $0 \le i < n-1$, and returns the object $x_{i+1}$ immediately 
following $x_i$ in the given total ordering of S. We can loop through all objects in S using the 
following pseudocode. 


x=first(S); 
process the object x; 
while (x is not last(S)) do 
{ x=next(x,S); 
  process the object x; 
} 


Suppose we have built a ranking algorithm and an unranking algorithm for S, which 
implement a bijection $\mathrm{rk} : S \to [n]$ and its inverse $\mathrm{unrk} : [n] \to S$. We have an associated 
total ordering of S defined by setting $\mathrm{unrk}(0) < \mathrm{unrk}(1) < \cdots < \mathrm{unrk}(n-1)$. We can 
use these algorithms to create a successor algorithm for S (relative to this ordering) as 
follows. Define first(S) = unrk(0); define last(S) = unrk(n − 1); and define next(x, S) = 
unrk(rk(x) + 1). Thus the problem of finding successor algorithms can be viewed as a 
special case of the problem of finding ranking and unranking algorithms. However, there 
may be more efficient ways to implement the next subroutine besides ranking x, adding 
1, and unranking the new value. In the second half of this chapter, we develop techniques 
for building successor algorithms that never explicitly compute the ranks of the objects 
involved. 
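
As a rough illustration of the construction first = unrk(0), last = unrk(n − 1), next(x) = unrk(rk(x) + 1), here is a Python sketch. The helper name `make_successor` and the demo set (binary strings of length 3, ranked by their value in base 2) are assumptions chosen for this example only.

```python
def make_successor(rank, unrank, n):
    """Build first/last/next from a ranking map and its inverse on an n-element set."""
    def first():
        return unrank(0)

    def last():
        return unrank(n - 1)

    def nxt(x):
        r = rank(x)
        if r >= n - 1:
            raise ValueError("the last object has no successor")
        return unrank(r + 1)

    return first, last, nxt

# Demo set: binary strings of length 3, ranked by their base-2 value.
def rank_bits(s):
    return int(s, 2)

def unrank_bits(j):
    return format(j, "03b")

first, last, nxt = make_successor(rank_bits, unrank_bits, 8)

def traverse():
    """The traversal loop from the pseudocode, collecting objects in a list."""
    x = first()
    seen = [x]
    while x != last():
        x = nxt(x)
        seen.append(x)
    return seen
```

Each call to `nxt` here ranks and then unranks; the successor algorithms developed later in the chapter avoid that round trip.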




6.2 The Bijective Sum Rule 


We begin our study of ranking and unranking by revisiting the fundamental counting rules 
from Chapter 1. Our first rule lets us assemble ranking (or unranking) maps for two disjoint 
finite sets to obtain a ranking (or unranking) map for the union of these sets. 


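6.2. The Bijective Sum Rule for Two Sets. Let S and T be disjoint finite sets. Given 
bijections $f : S \to [n]$ and $g : T \to [m]$, there is a bijection $h : S \cup T \to [n+m]$ defined by 

$$h(x) = \begin{cases} f(x) & \text{if } x \in S; \\ g(x) + n & \text{if } x \in T. \end{cases}$$

The inverse of h is given by 

$$h^{-1}(j) = \begin{cases} f^{-1}(j) & \text{if } 0 \le j < n; \\ g^{-1}(j - n) & \text{if } n \le j < n + m. \end{cases}$$

We sometimes use the notation $h = f + g$ and $h^{-1} = f^{-1} + g^{-1}$. 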


We omit the detailed verification of this rule, except to remark that the disjointness of 
S and T is critical when showing that h is a well-defined function and that the claimed 
inverse of h is injective. 

Observe that the order in which we combine the bijections makes a difference: although 
$S \cup T = T \cup S$ and $n + m = m + n$, the bijection $f + g : S \cup T \to [n + m]$ is not the same 
as the bijection $g + f : T \cup S \to [m + n]$. Intuitively, the ranking bijection $f + g$ assigns 
earlier ranks to elements of S (using f to determine these ranks) and assigns later ranks to 
elements of T (using g); $g + f$ does the opposite. Similarly, the unranking map $f^{-1} + g^{-1}$ 
generates a list in which elements of S occur first, followed by elements of T; $g^{-1} + f^{-1}$ 
lists elements of T first, then S. 
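
The effect of the order of combination is easy to see in a small experiment. In this Python sketch the helper `sum_rule` and the two sample sets (vowels and decimal digits, each ranked by position in a string) are assumptions made for illustration.

```python
def sum_rule(f, f_inv, n, in_S, g, g_inv, m):
    """Return (h, h_inv) for h = f + g, where f ranks S (size n) and g ranks T (size m)."""
    def h(x):
        return f(x) if in_S(x) else g(x) + n

    def h_inv(j):
        if 0 <= j < n:
            return f_inv(j)
        if n <= j < n + m:
            return g_inv(j - n)
        raise ValueError("j is outside the valid range")

    return h, h_inv

VOWELS = "aeiou"        # sample set S, ranked by position
DIGITS = "0123456789"   # sample set T, ranked by position

def pieces(chars):
    """Ranking map, unranking map, size, and membership test for a string of symbols."""
    return chars.index, (lambda j: chars[j]), len(chars), (lambda x: x in chars)

f, f_inv, n, in_S = pieces(VOWELS)
g, g_inv, m, in_T = pieces(DIGITS)

h, h_inv = sum_rule(f, f_inv, n, in_S, g, g_inv, m)      # f + g: vowels first
h2, h2_inv = sum_rule(g, g_inv, m, in_T, f, f_inv, n)    # g + f: digits first

list_fg = "".join(h_inv(j) for j in range(n + m))
list_gf = "".join(h2_inv(j) for j in range(n + m))
```

Both maps are bijections onto [15], but `h` ranks the vowel e as 1 while `h2` ranks it as 11.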

Iterating the Bijective Sum Rule for Two Sets leads to the following general version of 
this rule. 


6.3. The Bijective Sum Rule. Assume $S_1, \ldots, S_k$ are pairwise disjoint finite sets with 
$|S_i| = n_i$, and suppose we are given bijections $f_i : S_i \to [n_i]$ for $1 \le i \le k$. Let 
$S = S_1 \cup \cdots \cup S_k$ and $n = |S| = n_1 + \cdots + n_k$. There is a bijection $f : S \to [n]$ defined by 

$$f(x) = f_i(x) + \sum_{j<i} n_j \quad \text{for } x \in S_i.$$

For $j \in [n]$, we may compute $f^{-1}(j)$ as follows. There exists a unique i such that $1 \le i \le k$ 
and $n_1 + \cdots + n_{i-1} \le j < n_1 + \cdots + n_i$; we have 

$$f^{-1}(j) = f_i^{-1}(j - [n_1 + \cdots + n_{i-1}]).$$

We introduce notation $f = \sum_{i=1}^{k} f_i$ and $f^{-1} = \sum_{i=1}^{k} f_i^{-1}$. 


As before, we omit the formal proof of the Bijective Sum Rule. One can give a direct 
proof that the maps in question are bijections, or use an induction argument involving the 
Bijective Sum Rule for Two Sets to show that $\sum_{i=1}^{k} f_i = \left(\sum_{i=1}^{k-1} f_i\right) + f_k$. Compare to §1.15. 

Intuitively, $f_1 + \cdots + f_k$ is the ranking map that assigns elements of $S_1$ to positions 0 
through $n_1 - 1$ using $f_1$, assigns elements of $S_2$ to positions $n_1$ through $n_1 + n_2 - 1$ using 
$f_2$, etc. The unranking map $f_1^{-1} + \cdots + f_k^{-1}$ generates a listing of S in which objects in $S_1$ 
occur first, then objects in $S_2$, and so on. 

For programming purposes, it is helpful to have a pseudocode version of the Bijective 
Sum Rule. Here and below, we use notation such as rank(x,S_i) to denote the ranking 
function for the set $S_i$ applied to the input object x. The pseudocode appears in Figure 6.1. 
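
For readers who prefer runnable code, the Bijective Sum Rule might be rendered in Python roughly as follows. The `RankedSet` record, the function names, and the three sample sets are assumptions introduced here; the rule itself only requires a size, a membership test, and ranking and unranking maps for each piece.

```python
from typing import Any, Callable, NamedTuple, Sequence

class RankedSet(NamedTuple):
    """What we must know about each piece S_i (a hypothetical container)."""
    size: int
    contains: Callable[[Any], bool]
    rank: Callable[[Any], int]
    unrank: Callable[[int], Any]

def rank_union(x, parts: Sequence[RankedSet]) -> int:
    """Rank x in the disjoint union S_1 u ... u S_k."""
    offset = 0                       # n_1 + ... + n_{i-1}
    for part in parts:
        if part.contains(x):
            return offset + part.rank(x)
        offset += part.size
    raise ValueError("x is not in S")

def unrank_union(j: int, parts: Sequence[RankedSet]):
    """Return the object of rank j in the disjoint union."""
    if j < 0:
        raise ValueError("j is outside the valid range for S")
    for part in parts:
        if j < part.size:
            return part.unrank(j)
        j -= part.size
    raise ValueError("j is outside the valid range for S")

def from_string(chars: str) -> RankedSet:
    """Sample pieces: single characters ranked by position in a string."""
    return RankedSet(len(chars), lambda x: x in chars, chars.index, lambda j: chars[j])

PARTS = [from_string("aeiou"), from_string("0123456789"), from_string("!?")]
listing = "".join(unrank_union(j, PARTS) for j in range(17))
```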
6.3 The Bijective Product Rule for Two Sets 


Our next goal is to develop a bijective version of the Product Rule that can be used to 
build ranking and unranking maps. We approach the full rule gradually through a sequence 
of lemmas. The reader may already know the following theorem about integer division: for 
all integers a and m with m > 0, there exist unique integers q and r with $a = qm + r$ and 
$0 \le r < m$. We call q and r the quotient and remainder when a is divided by m, writing 
$q = a \operatorname{div} m$ and $r = a \bmod m$. Our first lemma is a bijective reformulation of integer division 
that is a key ingredient in the creation of ranking and unranking algorithms. 


6.4. Lemma. For all $n, m \in \mathbb{Z}_{>0}$, there is a bijection $p_{n,m} : [n] \times [m] \to [nm]$ defined by 
$p_{n,m}(i, j) = im + j$ for all $i \in [n]$ and $j \in [m]$. For all $a \in [nm]$, 

$$p_{n,m}^{-1}(a) = (a \operatorname{div} m,\; a \bmod m).$$


Proof. If one already knows the Integer Division Theorem quoted above, then the 
surjectivity and injectivity of $p_{n,m}$ quickly follow from the existence and uniqueness of the quotient 
and remainder when a is divided by m. Here we give a different proof in which we appeal to 
Assumptions: S is the disjoint union of finite sets S_1,...,S_k; for all i, 
we already know |S_i| and ranking and unranking functions for S_i. 

define procedure rank(x,S): 
{ value=0; 
  for i=1 to k do 
  { if (x is in S_i) then return value+rank(x,S_i); 
    else value=value+|S_i|; 
  } 
  return "error: x is not in S"; 
} 

define procedure unrank(j,S): 
{ for i=1 to k do 
  { if (0<=j<|S_i|) then return unrank(j,S_i); 
    else j=j-|S_i|; 
  } 
  return "error: j is outside the valid range for S"; 
} 

FIGURE 6.1 
Pseudocode for the Bijective Sum Rule. 


the Bijective Sum Rule to construct the bijection $p_{n,m}$ automatically. For each $i \in [n]$, 
define a set $S_i = \{i\} \times [m]$. Also define a ranking bijection $f_i : S_i \to [m]$ by letting $f_i(i, j) = j$ 
for all $j \in [m]$. Since $[n] \times [m]$ is the union of the pairwise disjoint sets $S_0, \ldots, S_{n-1}$, the 
Bijective Sum Rule assembles these bijections to produce a bijection $f = \sum_{i=0}^{n-1} f_i$ from $[n] \times [m]$ 
onto $[nm]$. To finish, we need only check that $p_{n,m}$ (as defined in the lemma statement) 
equals the function f. Given $(i, j) \in [n] \times [m]$, we calculate $f(i, j)$ by noting that $(i, j) \in S_i$ 
and $|S_k| = m$ for all $k < i$, so 

$$f(i, j) = f_i(i, j) + \sum_{0 \le k < i} |S_k| = j + im = p_{n,m}(i, j).$$

The Bijective Sum Rule guarantees the invertibility of f. Writing what this means in more 
detail, we see that any $a \in [nm]$ can be written in exactly one way in the form $a = im + j$, 
where $i \in [n]$ and $j \in [m]$. This is exactly the existence and uniqueness assertion of the 
Integer Division Theorem (for the specified range of a's). So the claimed formula for $p_{n,m}^{-1}$ 
is well-defined and correct. □ 


Notice that the previous proof implicitly contains an algorithm for computing the 
quotient and remainder when a is divided by m. Indeed, because $p_{n,m}^{-1}$ is the unranking map 
$\sum_{i=0}^{n-1} f_i^{-1}$, we can use the algorithm in Figure 6.1 to find $p_{n,m}^{-1}(a) = (a \operatorname{div} m,\; a \bmod m)$. 
Since each $|S_i| = m$ in this case, the unranking algorithm proceeds by repeatedly subtracting 
m from a until a value less than m is reached. That value is the remainder $a \bmod m$, 
whereas the number of copies of m that we subtracted is the quotient $a \operatorname{div} m$. This is none 
other than the iterated subtraction algorithm for performing integer division. We hasten to 
point out that other, much more efficient division algorithms exist (such as the long division 
technique learned in grade school). 
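
The iterated subtraction procedure described above can be written out as follows. This is a minimal Python sketch; the function name and the input checks are our own additions.

```python
def divmod_by_subtraction(a: int, m: int):
    """Return (a div m, a mod m) for a >= 0 and m > 0 by repeated subtraction.

    This mirrors the sum-rule unranking loop when every piece has size m:
    each pass skips one block of m ranks and counts how many blocks were skipped.
    """
    if a < 0 or m <= 0:
        raise ValueError("need a >= 0 and m > 0")
    quotient = 0
    while a >= m:
        a -= m
        quotient += 1
    return quotient, a
```

The loop runs a div m times, so this is far slower than built-in division for large quotients; it is shown only to make the connection with the Sum Rule concrete.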
Here is another technical point. Although multiplication of numbers is commutative, 
the maps $p_{n,m}$ and $p_{m,n}$ are not equal when $n \ne m$, since their domains are not the same. 
Consider the bijection $s : [n] \times [m] \to [m] \times [n]$ given by $s(i, j) = (j, i)$. The maps $p_{n,m}$ and 
$p_{m,n} \circ s$ have the same domain but are still unequal, because $p_{n,m}(i, j) = im + j$, whereas 
$p_{m,n}(s(i, j)) = jn + i$. We could have built the map $p_{m,n} \circ s$ from the Bijective Sum Rule 
by writing $[n] \times [m]$ as the disjoint union of the slices $[n] \times \{j\}$, for $j \in [m]$. 

To use the maps $p_{n,m}$ to create ranking algorithms for sets of combinatorial objects, we 
need the following version of the Bijection Rule. 


6.5. The Bijection Rule for Ranking and Unranking Maps. Suppose $F : X \to Y$ is 
a bijection, and we know algorithms to compute F and $F^{-1}$. If $\mathrm{rk} : Y \to [n]$ is a ranking 
map for Y, then $\mathrm{rk} \circ F$ is a ranking map for X. If $\mathrm{unrk} : [n] \to Y$ is an unranking map for 
Y, then $F^{-1} \circ \mathrm{unrk}$ is an unranking map for X. 


Proof. The assertions follow immediately from the facts that the composition of bijections 
is a bijection, and the inverse of a bijection is a bijection. □ 


We apply this rule in two stages to reach the Bijective Product Rule for Two Sets. 


6.6. The Bijective Cartesian Product Rule for Two Sets. Let S and T be finite sets. 
Given ranking bijections $f : S \to [n]$ and $g : T \to [m]$, there is a bijection $p : S \times T \to [nm]$ 
defined by 

$$p(x, y) = f(x)m + g(y) \quad \text{for all } x \in S \text{ and } y \in T.$$

For all $a \in [nm]$, 

$$p^{-1}(a) = (f^{-1}(a \operatorname{div} m),\; g^{-1}(a \bmod m)).$$


Proof. Define $F : S \times T \to [n] \times [m]$ by $F(x, y) = (f(x), g(y))$ for $x \in S$ and $y \in T$. One 
readily checks that F is a bijection with inverse $F^{-1}(i, j) = (f^{-1}(i), g^{-1}(j))$ for $i \in [n]$ 
and $j \in [m]$. We also have the bijection $p_{n,m} : [n] \times [m] \to [nm]$ from the earlier lemma. 
Composing these bijections and using the formulas for $p_{n,m}$ and $p_{n,m}^{-1}$, we obtain the formulas 
for p and $p^{-1}$ stated above. □ 


More generally, if $f : S \to A$ and $g : T \to B$ are bijections, the map $F : S \times T \to A \times B$ 
sending $(x, y) \in S \times T$ to $(f(x), g(y))$ is a bijection; we denote the map F by $f \times g$. 


6.7. The Bijective Product Rule for Two Sets. Suppose S and T are finite sets 
such that $|S| = n$, $|T| = m$, and we know ranking and unranking algorithms for S and T. 
Suppose X is a set such that every object $x \in X$ can be constructed in exactly one way by 
first choosing an object $s \in S$, then choosing an object $t \in T$, and finally assembling these 
objects via some bijection $F : S \times T \to X$. There is a ranking bijection $\mathrm{rk}_X : X \to [nm]$ 
given by 

$$\mathrm{rk}_X(x) = \mathrm{rk}_S(s)\,m + \mathrm{rk}_T(t) \quad \text{where } F(s, t) = x.$$

The corresponding unranking bijection is given by 

$$\mathrm{unrk}_X(a) = F(\mathrm{unrk}_S(a \operatorname{div} m),\; \mathrm{unrk}_T(a \bmod m)) \quad \text{for } a \in [nm].$$


Proof. This follows from the Bijection Rule by composing the previously constructed 
ranking and unranking maps for $S \times T$ with the bijections $F^{-1}$ and F, respectively. □ 


Figure 6.2 gives pseudocode for the Bijective Product Rule for Two Sets. 
Assumptions: S is an n-element set, T is an m-element set; 
we know a bijection F:S x T -> X and its inverse G:X -> S x T; 
we know ranking and unranking functions for S and T. 

define procedure rank(x,X): 
{ compute (s,t)=G(x); 
  return rank(s,S)*|T|+rank(t,T); 
} 

define procedure unrank(a,X): 
{ q=a div |T|; 
  r=a mod |T|; 
  s=unrank(q,S); 
  t=unrank(r,T); 
  return F(s,t); 
} 

FIGURE 6.2 
Pseudocode for the Bijective Product Rule for Two Sets. 
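
A Python version of the two-set product rule might look like the sketch below. The sample set X (a 52-card deck, with each card written as a value character followed by a suit character), the helper name `make_product_ranker`, and the orderings of suits and values are all assumptions chosen for illustration.

```python
def make_product_ranker(rank_S, unrank_S, rank_T, unrank_T, m, F, G):
    """Return (rank_X, unrank_X), where F : S x T -> X is a bijection with inverse G
    and m = |T|."""
    def rank_X(x):
        s, t = G(x)
        return rank_S(s) * m + rank_T(t)

    def unrank_X(a):
        q, r = divmod(a, m)          # q = a div |T|, r = a mod |T|
        return F(unrank_S(q), unrank_T(r))

    return rank_X, unrank_X

SUITS = "CDHS"               # sample S, ranked by position (4 elements)
VALUES = "A23456789TJQK"     # sample T, ranked by position (13 elements)

def F(s, t):
    """Assemble a card such as 'QH' from suit s and value t."""
    return t + s

def G(x):
    """Inverse of F: split a card into (suit, value)."""
    return x[1], x[0]

rank_card, unrank_card = make_product_ranker(
    SUITS.index, lambda j: SUITS[j],
    VALUES.index, lambda j: VALUES[j],
    len(VALUES), F, G,
)
```

With these choices the suit is the more significant component, so unranking 0, 1, ..., 51 lists all clubs first, then diamonds, hearts, and spades.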




6.4 The Bijective Product Rule 


This section extends the product rules from the previous section to products of k sets, for 
any fixed positive integer k. The key arithmetical ingredient is the following generalization 
of the product map $p_{n,m}$. 


6.8. Lemma. For all $n_1, n_2, \ldots, n_k \in \mathbb{Z}_{>0}$, there is a bijection 

$$p_{n_1, n_2, \ldots, n_k} : [n_1] \times [n_2] \times \cdots \times [n_k] \to [n_1 n_2 \cdots n_k]$$

given by 

$$p_{n_1, \ldots, n_k}(c_1, c_2, \ldots, c_k) = c_1 n_2 n_3 \cdots n_k + c_2 n_3 \cdots n_k + \cdots + c_{k-1} n_k + c_k = \sum_{i=1}^{k} c_i \prod_{j=i+1}^{k} n_j.$$

For all $a \in [n_1 n_2 \cdots n_k]$, the following algorithm computes $p_{n_1, \ldots, n_k}^{-1}(a)$. For i looping 
from k down to 1, let $r_i = a \bmod n_i$, and then replace a by $a \operatorname{div} n_i$; then $p_{n_1, \ldots, n_k}^{-1}(a) = 
(r_1, r_2, \ldots, r_k)$. 

Proof. We use induction on k to show that $p_{n_1, \ldots, n_k}$ is a bijection. When k = 1, $p_{n_1}$ is the 
identity map on $[n_1]$. When k = 2, the result follows from Lemma 6.4. Now assume k > 2 
and the result is already known for products of k − 1 sets. Writing $n = n_1 n_2 \cdots n_k$ and 
$n' = n_2 n_3 \cdots n_k$, observe that 

$$\begin{aligned} p_{n_1, \ldots, n_k}(c_1, \ldots, c_k) &= c_1 (n_2 \cdots n_k) + \sum_{i=2}^{k} c_i \prod_{j=i+1}^{k} n_j \\ &= c_1 n' + p_{n_2, \ldots, n_k}(c_2, \ldots, c_k) \\ &= p_{n_1, n'}(c_1, p_{n_2, \ldots, n_k}(c_2, \ldots, c_k)). \end{aligned}$$

This means that $p_{n_1, \ldots, n_k}$ is the composition of the two bijections 

$$\mathrm{id}_{[n_1]} \times p_{n_2, \ldots, n_k} : [n_1] \times ([n_2] \times \cdots \times [n_k]) \to [n_1] \times [n'] \quad \text{and} \quad p_{n_1, n'} : [n_1] \times [n'] \to [n], \tag{6.1}$$

so $p_{n_1, \ldots, n_k}$ is a bijection. 

We verify the algorithm for $p_{n_1, \ldots, n_k}^{-1}$ by showing that applying this algorithm to $a = 
p_{n_1, \ldots, n_k}(c_1, \ldots, c_k)$ does recover $(c_1, \ldots, c_k)$. Again we use induction on k. The case k = 1 is 
immediate. Fix k > 1, and assume we have already verified the correctness of the algorithm 
for $p_{n_1, \ldots, n_{k-1}}^{-1}$. Observe that 

$$p_{n_1, \ldots, n_k}(c_1, \ldots, c_k) = \sum_{i=1}^{k} c_i \prod_{i < j \le k} n_j = n_k \left( \sum_{i=1}^{k-1} c_i \prod_{i < j \le k-1} n_j \right) + c_k.$$

Since $0 \le c_k < n_k$, this expression shows that the quotient and remainder when we divide 
a by $n_k$ are 

$$a \operatorname{div} n_k = \sum_{i=1}^{k-1} c_i \prod_{i < j \le k-1} n_j = p_{n_1, \ldots, n_{k-1}}(c_1, \ldots, c_{k-1})$$

and $a \bmod n_k = c_k$, respectively. So the first division step successfully recovers $c_k$, and 
replaces a by $p_{n_1, \ldots, n_{k-1}}(c_1, \ldots, c_{k-1})$. By induction hypothesis, the remaining divisions 
compute 

$$p_{n_1, \ldots, n_{k-1}}^{-1}(p_{n_1, \ldots, n_{k-1}}(c_1, \ldots, c_{k-1})) = (c_1, \ldots, c_{k-1}).$$

So the algorithm for inverting $p_{n_1, \ldots, n_k}$ is correct. □ 


6.9. Example. Suppose $(n_1, n_2, n_3, n_4, n_5) = (4, 6, 5, 4, 2)$, so $n = n_1 n_2 n_3 n_4 n_5 = 960$. 
Then 

$$p(3, 1, 0, 2, 1) = 3 \cdot (6 \cdot 5 \cdot 4 \cdot 2) + 1 \cdot (5 \cdot 4 \cdot 2) + 0 \cdot (4 \cdot 2) + 2 \cdot (2) + 1 = 765.$$

To compute $p^{-1}(222)$, first divide 222 by $n_5 = 2$ to get $q_5 = 111$ and $r_5 = 0$. Then divide 
111 by $n_4 = 4$ to get $q_4 = 27$ and $r_4 = 3$. Then divide 27 by $n_3 = 5$ to get $q_3 = 5$ and 
$r_3 = 2$. Then divide 5 by $n_2 = 6$ to get $q_2 = 0$ and $r_2 = 5$. Finally, divide 0 by $n_1 = 4$ to 
get $q_1 = 0$ and $r_1 = 0$. We conclude that $p^{-1}(222) = (0, 5, 2, 3, 0)$. 
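
The two computations in this example can be checked mechanically. In the Python sketch below, the names `p` and `p_inverse` and the argument order are our own conventions; `p` evaluates the defining sum by Horner's scheme rather than term by term, which gives the same value, and `p_inverse` follows the right-to-left division algorithm of Lemma 6.8.

```python
def p(radices, digits):
    """Compute p_{n_1,...,n_k}(c_1,...,c_k) = sum of c_i * n_{i+1} * ... * n_k."""
    if len(radices) != len(digits):
        raise ValueError("need one digit per radix")
    value = 0
    for n_i, c_i in zip(radices, digits):
        if not 0 <= c_i < n_i:
            raise ValueError("each c_i must lie in [n_i]")
        value = value * n_i + c_i      # Horner's scheme
    return value

def p_inverse(radices, a):
    """Recover (c_1,...,c_k) from a by dividing by n_k, n_{k-1}, ..., n_1 in turn."""
    if a < 0:
        raise ValueError("a is outside the valid range")
    digits = []
    for n_i in reversed(radices):
        a, r = divmod(a, n_i)          # r = a mod n_i, then a is replaced by a div n_i
        digits.append(r)
    if a != 0:
        raise ValueError("a is outside the valid range")
    return tuple(reversed(digits))

RADICES = (4, 6, 5, 4, 2)
value_of_example = p(RADICES, (3, 1, 0, 2, 1))
digits_of_222 = p_inverse(RADICES, 222)
```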


6.10. Remark. Here is another algorithm for computing $p_{n_1, \ldots, n_k}^{-1}(a) = (c_1, \ldots, c_k)$, based 
on (6.1), that recovers the sequence $(c_1, \ldots, c_k)$ from left to right. First, divide a by 
$n_2 n_3 \cdots n_k$ to obtain a quotient $q_1$ and a remainder $r_1$. Set $c_1 = q_1$, and recover $(c_2, \ldots, c_k)$ 
[...] 

[...] consisting of k digits $c_i$ coming from the range $[b] = \{0, 1, \ldots, b-1\}$. Taking b = 10, 
we recover the familiar decimal representation of nonnegative integers. The maps $p_{n_1, \ldots, n_k}$ 
provide generalizations of these representations in which the allowable digits in position i 
come from the set $[n_i]$, and the place value of position i is $n_{i+1} \cdots n_k$. 
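
The left-to-right idea can be sketched in Python by dividing by the place values $n_{i+1} \cdots n_k$ in turn. Continuing with the remainder at each step is our reading of (6.1), and the function name is an assumption.

```python
from math import prod

def unrank_left_to_right(radices, a):
    """Recover (c_1,...,c_k) from a, most significant digit first.

    At step i the current value is divided by the place value n_{i+1} * ... * n_k;
    the quotient is c_i and the remainder is carried to the next step.
    """
    if not 0 <= a < prod(radices):
        raise ValueError("a is outside the valid range")
    digits = []
    for i in range(len(radices)):
        place = prod(radices[i + 1:])   # empty product is 1 for the last position
        c, a = divmod(a, place)
        digits.append(c)
    return tuple(digits)
```

Taking every radix equal to 10 gives the usual decimal digits, for example `unrank_left_to_right((10, 10, 10, 10), 2024)` yields `(2, 0, 2, 4)`.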
6.12. The Bijective Cartesian Product Rule. Suppose $S_1, \ldots, S_k$ are finite sets, $|S_i| = 
n_i$, and $f_i : S_i \to [n_i]$ is a ranking bijection for $1 \le i \le k$. There is a bijection 
$h : S_1 \times \cdots \times S_k \to [n_1 \cdots n_k]$ defined by 

$$h(x_1, \ldots, x_k) = p_{n_1, \ldots, n_k}(f_1(x_1), \ldots, f_k(x_k)) \quad \text{for all } x_i \in S_i.$$

For all $a \in [n_1 \cdots n_k]$, 

$$h^{-1}(a) = (f_1^{-1}(c_1), \ldots, f_k^{-1}(c_k)) \quad \text{where } (c_1, \ldots, c_k) = p_{n_1, \ldots, n_k}^{-1}(a).$$


Proof. This follows from Lemma 6.8 and the Bijection Rule for Ranking Maps, since the 
map $f_1 \times \cdots \times f_k : S_1 \times \cdots \times S_k \to [n_1] \times \cdots \times [n_k]$ sending $(x_1, \ldots, x_k)$ to $(f_1(x_1), \ldots, f_k(x_k))$ 
is a bijection. □ 


Finally, we can state the most general version of the Product Rule for ranking and 
unranking maps. 


6.13. The Bijective Product Rule. Suppose $S_1, \ldots, S_k$ are finite sets with $|S_i| = n_i$, 
and we know ranking and unranking algorithms for each $S_i$. Suppose X is a set such that 
every object $x \in X$ can be constructed uniquely by choosing $x_i \in S_i$ for $1 \le i \le k$ and 
assembling these choices via some bijection $F : S_1 \times \cdots \times S_k \to X$. There is a ranking 
bijection $\mathrm{rk}_X : X \to [n_1 \cdots n_k]$ given by 

$$\mathrm{rk}_X(x) = p_{n_1, \ldots, n_k}(\mathrm{rk}_{S_1}(x_1), \ldots, \mathrm{rk}_{S_k}(x_k)) \quad \text{where } F(x_1, \ldots, x_k) = x.$$

The corresponding unranking bijection acts on $a \in [n_1 \cdots n_k]$ via 

$$\mathrm{unrk}_X(a) = F(\mathrm{unrk}_{S_1}(c_1), \ldots, \mathrm{unrk}_{S_k}(c_k)), \quad \text{where } (c_1, \ldots, c_k) = p_{n_1, \ldots, n_k}^{-1}(a).$$


A pseudocode implementation of this rule appears in Figure 6.3. 




6.5 Ranking Words 


In the next several sections, we give sample applications of the Bijective Sum Rule and 
Bijective Product Rule by constructing ranking and unranking maps for various sets of 
combinatorial objects. 


6.14. Example: Four-Letter Words. Let S be the set of all four-letter words using the 
26-letter alphabet $A = \{a, b, \ldots, z\}$. We can think of S as the Cartesian product $A^4 = 
A \times A \times A \times A$ of four copies of A. Using the Bijective Cartesian Product Rule, we obtain 
a ranking map for S from a ranking map for A. The standard alphabetical ordering of 
letters induces a ranking bijection $\mathrm{rk}_A : A \to [26]$ given by $\mathrm{rk}_A(a) = 0$, $\mathrm{rk}_A(b) = 1$, ..., 
$\mathrm{rk}_A(z) = 25$. The corresponding ranking bijection for S is the map $\mathrm{rk}_S : S \to [26^4]$ given 
by 

$$\mathrm{rk}_S(w_1 w_2 w_3 w_4) = p_{26,26,26,26}(\mathrm{rk}_A(w_1), \mathrm{rk}_A(w_2), \mathrm{rk}_A(w_3), \mathrm{rk}_A(w_4)).$$

For example, 

$$\mathrm{rk}_S(\text{goop}) = p_{26,26,26,26}(6, 14, 14, 15) = 6 \cdot 26^3 + 14 \cdot 26^2 + 14 \cdot 26^1 + 15 = 115{,}299;$$

$$\mathrm{rk}_S(\text{pogo}) = p_{26,26,26,26}(15, 14, 6, 14) = 15 \cdot 26^3 + 14 \cdot 26^2 + 6 \cdot 26^1 + 14 = 273{,}274.$$
Assumptions: We know a bijection F:S_1 x ... x S_k -> X and its inverse 
G:X -> S_1 x ... x S_k; for all i, we know |S_i| and 
ranking and unranking functions for the finite set S_i. 

define procedure rank(x,X): 
{ compute (x_1,...,x_k)=G(x); 
  prod=1; 
  value=0; 
  for i=k downto 1 do 
  { value=value + rank(x_i,S_i)*prod; 
    prod=prod*|S_i|; 
  } 
  return value; 
} 

define procedure unrank(a,X): 
{ for i=k downto 1 do 
  { q=a div |S_i|; 
    r=a mod |S_i|; 
    x_i=unrank(r,S_i); 
    a=q; 
  } 
  return F(x_1,...,x_k); 
} 

FIGURE 6.3 
Pseudocode for the Bijective Product Rule. 


To compute the associated unranking map $\mathrm{unrk}_S : [26^4] \to S$ on $a \in [26^4]$, first find 
$(c_1, c_2, c_3, c_4) = p_{26,26,26,26}^{-1}(a)$. Note that $c_1, \ldots, c_4$ are the digits in the base 26 
representation of a. Return the answer $\mathrm{unrk}_S(a) = \mathrm{unrk}_A(c_1)\, \mathrm{unrk}_A(c_2)\, \mathrm{unrk}_A(c_3)\, \mathrm{unrk}_A(c_4)$. For 
example, to unrank 200,000, we first calculate $p_{26,26,26,26}^{-1}(200{,}000) = (11, 9, 22, 8)$. Then 

$$\mathrm{unrk}_S(200{,}000) = \mathrm{rk}_A^{-1}(11)\, \mathrm{rk}_A^{-1}(9)\, \mathrm{rk}_A^{-1}(22)\, \mathrm{rk}_A^{-1}(8) = \text{ljwi}.$$

In general, if A is an m-letter alphabet with ranking map $\mathrm{rk}_A : A \to [m]$, then a ranking 
map for the set of k-letter words $A^k$ is given by 

$$\mathrm{rk}(w_1 w_2 \cdots w_k) = p_{m,m,\ldots,m}(\mathrm{rk}_A(w_1), \ldots, \mathrm{rk}_A(w_k)) = \sum_{i=1}^{k} \mathrm{rk}_A(w_i)\, m^{k-i}.$$

To unrank an integer $a \in [m^k]$, find the base m representation $(c_1, \ldots, c_k) = p_{m,m,\ldots,m}^{-1}(a)$ 
and then replace each digit $c_i$ by the letter $\mathrm{rk}_A^{-1}(c_i)$. 
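
The word-ranking formulas translate almost line for line into code. The following Python sketch assumes the lowercase English alphabet in its usual order as the default; the function names and the `alphabet` parameter are illustrative choices.

```python
import string

ALPHABET = string.ascii_lowercase   # a, b, ..., z ranked 0, 1, ..., 25

def rank_word(word, alphabet=ALPHABET):
    """Rank a word by reading its letters as base-m digits, m = len(alphabet)."""
    m = len(alphabet)
    value = 0
    for letter in word:
        value = value * m + alphabet.index(letter)
    return value

def unrank_word(a, k, alphabet=ALPHABET):
    """Return the k-letter word of rank a, for 0 <= a < m^k."""
    m = len(alphabet)
    if not 0 <= a < m ** k:
        raise ValueError("a is outside the valid range")
    letters = []
    for _ in range(k):
        a, r = divmod(a, m)          # peel off the least significant digit
        letters.append(alphabet[r])
    return "".join(reversed(letters))
```

For instance, `rank_word("goop")` and `unrank_word(200000, 4)` reproduce the values computed in Example 6.14.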


6.15. Example: Three-Letter Words. Consider $A = \{a, b, c\}$ and $S = A^3$. Define 
$\mathrm{rk}_A : A \to [3]$ by $\mathrm{rk}_A(a) = 0$, $\mathrm{rk}_A(b) = 1$, and $\mathrm{rk}_A(c) = 2$. For $w = w_1 w_2 w_3 \in S$, we 
have 

$$\mathrm{rk}_S(w) = 9\,\mathrm{rk}_A(w_1) + 3\,\mathrm{rk}_A(w_2) + \mathrm{rk}_A(w_3),$$

where $\mathrm{rk}_A(w_i)$ is the ith digit from the left in the base 3 expansion of $\mathrm{rk}_S(w)$. Table 6.1 
displays the 27 words in S and their corresponding ranks (given first in base 3 and then 
in base 10). Note that words are ranked in the same order that they would appear in a 
dictionary. 


aaa ↔ 000 = 0      baa ↔ 100 = 9       caa ↔ 200 = 18 
aab ↔ 001 = 1      bab ↔ 101 = 10      cab ↔ 201 = 19 
aac ↔ 002 = 2      bac ↔ 102 = 11      cac ↔ 202 = 20 
aba ↔ 010 = 3      bba ↔ 110 = 12      cba ↔ 210 = 21 
abb ↔ 011 = 4      bbb ↔ 111 = 13      cbb ↔ 211 = 22 
abc ↔ 012 = 5      bbc ↔ 112 = 14      cbc ↔ 212 = 23 
aca ↔ 020 = 6      bca ↔ 120 = 15      cca ↔ 220 = 24 
acb ↔ 021 = 7      bcb ↔ 121 = 16      ccb ↔ 221 = 25 
acc ↔ 022 = 8      bcc ↔ 122 = 17      ccc ↔ 222 = 26 


TABLE 6.1 
Ranking of three-letter words. 
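
Table 6.1 can be regenerated, and the dictionary-order observation checked, with a short Python sketch. The names `rank_word3` and `table` are ours; `itertools.product` is used only because it emits tuples in lexicographic order when its input is sorted.

```python
from itertools import product

ALPHABET3 = "abc"   # rk_A(a) = 0, rk_A(b) = 1, rk_A(c) = 2

def rank_word3(word):
    """rk_S(w) = 9 rk_A(w_1) + 3 rk_A(w_2) + rk_A(w_3), computed digit by digit."""
    value = 0
    for letter in word:
        value = 3 * value + ALPHABET3.index(letter)
    return value

# All 27 words in dictionary order.
words = ["".join(letters) for letters in product(ALPHABET3, repeat=3)]

# Rows of the table: (word, base-3 digits, base-10 rank).
table = [
    (w, "".join(str(ALPHABET3.index(ch)) for ch in w), rank_word3(w))
    for w in words
]
```

The ranks in `table` come out as 0, 1, ..., 26 in order, which is the statement that ranking order agrees with dictionary order.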


More generally, a ranking map built using the Bijective Cartesian Product Rule will 
induce a certain lexicographic ordering of the set $S = S_1 \times \cdots \times S_k$. Informally, the ranking 
map $\mathrm{rk}_S$ treats the first component of the k-tuple as most significant and the last component 
as the least significant. If we unrank $0, 1, \ldots, |S| - 1$ in this order, we obtain a list that 
begins with all the k-tuples that have $\mathrm{rk}_{S_1}^{-1}(0)$ in the first component. Next we get all the 
k-tuples that have $\mathrm{rk}_{S_1}^{-1}(1)$ in the first component, and so on. Each sublist is also arranged 
lexicographically based on the values in the remaining components of the k-tuples. 

We can describe the total ordering $\le_{\mathrm{lex}}$ induced by $\mathrm{rk}_S$ more formally as follows. Each 
factor $S_i$ has a total ordering $\le_i$ determined by the ranking map $\mathrm{rk}_{S_i}$ (namely, for all 
$x_i, y_i \in S_i$, $x_i \le_i y_i$ iff $\mathrm{rk}_{S_i}(x_i) \le \mathrm{rk}_{S_i}(y_i)$). It can be checked that for all $x = (x_1, \ldots, x_k)$ 
and $y = (y_1, \ldots, y_k)$ in S, $x \le_{\mathrm{lex}} y$ iff $x = y$ or $x_i <_i y_i$ for the least index i such that 
$x_i \ne y_i$. 

6.16. Example: Words with Restrictions. Let $S$ be the set of four-letter words
$w_1 w_2 w_3 w_4$ that begin and end with consonants and have a vowel in the second
position. Choosing letters from left to right and using the Product Rule, we see that
$|S| = 21 \cdot 5 \cdot 26 \cdot 21 = 57{,}330$. Let $C$, $V$, and $A$ denote the set of consonants, vowels, and
all letters, respectively. Using alphabetical order, we get ranking bijections $\mathrm{rk}_C : C \to [21]$,
$\mathrm{rk}_V : V \to [5]$, and $\mathrm{rk}_A : A \to [26]$. For example,

\[ \mathrm{rk}_V(\mathrm{a}) = 0,\quad \mathrm{rk}_V(\mathrm{e}) = 1,\quad \mathrm{rk}_V(\mathrm{i}) = 2,\quad \mathrm{rk}_V(\mathrm{o}) = 3,\quad \mathrm{rk}_V(\mathrm{u}) = 4. \]

Since $S = C \times V \times A \times C$, the Bijective Cartesian Product Rule provides a ranking map
$\mathrm{rk}_S : S \to [57{,}330]$ given by

\[ \mathrm{rk}_S(w_1 w_2 w_3 w_4) = p_{21,5,26,21}(\mathrm{rk}_C(w_1), \mathrm{rk}_V(w_2), \mathrm{rk}_A(w_3), \mathrm{rk}_C(w_4)). \]

For example,

\[ \mathrm{rk}_S(\text{host}) = p_{21,5,26,21}(5, 3, 18, 15) = 5 \cdot (5 \cdot 26 \cdot 21) + 3 \cdot (26 \cdot 21) + 18 \cdot (21) + 15 = 15{,}681. \]

We unrank by applying $p^{-1}_{21,5,26,21}$ and then decoding to letters. For example, repeated
division shows that

\[ p^{-1}_{21,5,26,21}(44001) = (16, 0, 15, 6), \]

and therefore $\mathrm{unrk}_S(44001) = \text{vapj}$. This unranking method generates the words in $S$ in
alphabetical order.


6.17. Example: License Plates. A California license plate consists of a digit, followed by 
three letters, followed by three digits. We can use the preceding ideas to rank and unrank 
license plates. For instance, 


\[ r(\text{3PZY292}) = p_{10,26,26,26,10,10,10}(3, 15, 25, 24, 2, 9, 2) = 63{,}542{,}292. \]
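The mixed-radix computations in the last two examples can be checked with a short Python sketch. The helper names `p`, `p_inverse`, `rank_restricted`, and `unrank_restricted` are our own labels for the maps $p_{n_1,\ldots,n_k}$, $p^{-1}_{n_1,\ldots,n_k}$, $\mathrm{rk}_S$, and $\mathrm{unrk}_S$; we assume lowercase English letters, with y counted as a consonant so that there are 21 consonants.

```python
from string import ascii_lowercase


def p(radices, digits):
    """Mixed-radix map p_{n_1,...,n_k}; the first digit is the most significant."""
    value = 0
    for n_i, c_i in zip(radices, digits):
        if not 0 <= c_i < n_i:
            raise ValueError("digit out of range")
        value = value * n_i + c_i
    return value


def p_inverse(radices, value):
    """Inverse of p, computed by repeated division starting from the last radix."""
    digits = []
    for n_i in reversed(radices):
        value, c_i = divmod(value, n_i)
        digits.append(c_i)
    return tuple(reversed(digits))


VOWELS = "aeiou"
CONSONANTS = "".join(ch for ch in ascii_lowercase if ch not in VOWELS)
FACTORS = [CONSONANTS, VOWELS, ascii_lowercase, CONSONANTS]  # S = C x V x A x C
RADICES = [len(f) for f in FACTORS]                          # 21, 5, 26, 21


def rank_restricted(word):
    """Rank a word of the form consonant-vowel-letter-consonant."""
    return p(RADICES, [f.index(ch) for f, ch in zip(FACTORS, word)])


def unrank_restricted(a):
    """Unrank an integer in [57330] to a word in S."""
    return "".join(f[c] for f, c in zip(FACTORS, p_inverse(RADICES, a)))
```

The same function `p` handles the license plate example by passing the radices 10, 26, 26, 26, 10, 10, 10.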




6.6 Ranking Permutations 


In the examples considered so far, the choices made at each stage of the Product Rule 
did not depend on what choices were made in previous stages. This section studies the 
more complicated case where the available choices do depend on what happened earlier. We 
illustrate this situation by solving the ranking and unranking problems for permutations. 

Suppose $A$ is an $n$-letter alphabet. Recall that a $k$-permutation of $A$ is a word
$w = w_1 w_2 \cdots w_k$, where the $w_i$'s are distinct elements of $A$. Let $S$ be the set of all $k$-permutations
of $A$. Using the ordinary Product Rule, we build elements $w$ in $S$ by choosing $w_1 \in A$ in $n$
ways, then choosing $w_2 \in A - \{w_1\}$ in $n - 1$ ways, and so on. At the $i$th stage (where
$1 \le i \le k$), we choose $w_i \in A - \{w_1, w_2, \ldots, w_{i-1}\}$ in $n - (i - 1)$ ways. Thus,
$|S| = n(n-1)\cdots(n-k+1)$. Notice that the set of choices available at the $i$th stage depends on
the choices made earlier, but the cardinality of this set is independent of previous choices.
This last fact is a key hypothesis of the Product Rule.

Let us rephrase the preceding counting argument to obtain a bijection between $S$ and
the product set $[n] \times [n-1] \times \cdots \times [n-k+1]$. Fix a total ordering $(x_0, x_1, \ldots, x_{n-1})$
of the letters in $A$; equivalently, fix an unranking bijection $\mathrm{unrk}_A : [n] \to A$ and let
$x_i = \mathrm{unrk}_A(i)$. Suppose $w = w_1 w_2 \cdots w_k \in S$. We must map $w$ to a $k$-tuple $(j_1, j_2, \ldots, j_k)$, where
$0 \le j_i < n - (i - 1)$. To compute $j_1$, locate $w_1$ in the sequence $x = (x_0, x_1, \ldots, x_{n-1})$, let $j_1$
be the number of letters preceding $w_1$ in the sequence, and then erase $w_1$ from the sequence
to get a new sequence $x'$. To compute $j_2$, find $w_2$ in the sequence $x'$, let $j_2$ be the number
of letters preceding it, and then erase $w_2$ to get a new sequence $x''$. Continue similarly to
generate $j_3, \ldots, j_k$. This process is reversible, as demonstrated in the next example, so we
have defined the required bijection. Combining this bijection (and its inverse) with the maps
$p_{n,n-1,\ldots,n-k+1}$ and $p^{-1}_{n,n-1,\ldots,n-k+1}$, we obtain ranking and unranking maps for $S$. It can be
verified that these maps correspond to the alphabetical order of permutations determined
by the given total ordering of the alphabet $A$.


6.18. Example. Let $n = 8$, $k = 5$, and $A$ = (a,b,c,d,e,f,g,h) with the standard alphabetical
ordering. Let $w$ = cfbgd $\in S$. We compute $(j_1, \ldots, j_5)$ as follows:

2 letters precede c in (a,b,c,d,e,f,g,h), so $j_1 = 2$;
4 letters precede f in (a,b,d,e,f,g,h), so $j_2 = 4$;
1 letter precedes b in (a,b,d,e,g,h), so $j_3 = 1$;
3 letters precede g in (a,d,e,g,h), so $j_4 = 3$;
1 letter precedes d in (a,d,e,h), so $j_5 = 1$.

Thus, cfbgd maps to $(j_1, \ldots, j_5) = (2, 4, 1, 3, 1)$. The rank of the word cfbgd is therefore

\[ p_{8,7,6,5,4}(2, 4, 1, 3, 1) = 2 \cdot (7 \cdot 6 \cdot 5 \cdot 4) + 4 \cdot (6 \cdot 5 \cdot 4) + 1 \cdot (5 \cdot 4) + 3 \cdot (4) + 1 = 2193. \]

Next, let us unrank the integer 982. First, repeated division gives

\[ p^{-1}_{8,7,6,5,4}(982) = (1, 1, 1, 0, 2). \]

Since $j_1 = 1$, the first letter of the word must be b. Removing b from the alphabet gives
(a,c,d,e,f,g,h). Since $j_2 = 1$, the second letter of the word is c. Removing c from the previous
list gives (a,d,e,f,g,h). Continuing in this way, we see that 982 unranks to give the word
bcdag.


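6.19. Example. Let $S$ be the set of permutations of $\{1, 2, 3, 4, 5, 6\}$. Using the procedure
above to rank the permutation 462153, we first compute $(j_1, \ldots, j_6) = (3, 4, 1, 0, 1, 0)$ and
then $p_{6,5,4,3,2,1}(3, 4, 1, 0, 1, 0) = 3 \cdot 120 + 4 \cdot 24 + 1 \cdot 6 + 0 \cdot 2 + 1 \cdot 1 + 0 = 463$. To unrank 397,
first compute $p^{-1}_{6,5,4,3,2,1}(397) = (3, 1, 2, 0, 1, 0)$. Then use these position numbers to recover the
permutation 425163.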
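The ranking and unranking procedure for $k$-permutations can be sketched in Python as follows. This is an illustrative implementation under our own naming (`rank_permutation`, `unrank_permutation`); the alphabet is assumed to be a sequence listed in its chosen total order, and the list `remaining` plays the role of the sequences $x, x', x'', \ldots$ in the text.

```python
def rank_permutation(word, alphabet):
    """Rank a k-permutation `word` of `alphabet` (a sequence listed in its total order)."""
    remaining = list(alphabet)
    rank = 0
    for letter in word:
        j = remaining.index(letter)        # number of letters preceding `letter`
        rank = rank * len(remaining) + j   # mixed radix n, n-1, ..., n-k+1
        remaining.pop(j)                   # erase the letter from the sequence
    return rank


def unrank_permutation(a, k, alphabet):
    """Return the k-permutation of `alphabet` with rank a, as a list of letters."""
    n = len(alphabet)
    radices = [n - i for i in range(k)]    # n, n-1, ..., n-k+1
    positions = []
    for radix in reversed(radices):        # repeated division, last digit first
        a, j = divmod(a, radix)
        positions.append(j)
    remaining = list(alphabet)
    # Replay the erasures: each position number selects and removes one letter.
    return [remaining.pop(j) for j in reversed(positions)]
```

For instance, `"".join(unrank_permutation(982, 5, "abcdefgh"))` gives bcdag, matching Example 6.18.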




6.7 Ranking Subsets 


According to the Subset Rule in §1.4, the number of $k$-element subsets of an $n$-element set
is $\binom{n}{k} = \frac{n!}{k!(n-k)!}$. This result was obtained indirectly, by enumerating $k$-permutations of an
$n$-element set in two ways and then dividing the resulting equation by $k!$. In general, the
operation of division presents serious problems when attempting to construct bijections.
Therefore, we adopt a different approach to the problem of ranking and unranking subsets.
Instead of using the Bijective Product Rule, we apply the Bijective Sum Rule to the recursion
characterizing binomial coefficients. This leads to recursive algorithms for ranking and
unranking subsets.

Write $C(n, k)$ for the number of $k$-element subsets of an $n$-element set. In Example 2.21,
we saw that these numbers satisfy the recursion

\[ C(n, k) = C(n-1, k) + C(n-1, k-1) \quad \text{for } 0 < k < n, \tag{6.2} \]

with initial conditions $C(n, 0) = C(n, n) = 1$. This recursion came from a combinatorial
argument involving the Sum Rule. Using the Bijective Sum Rule instead leads to recursively
defined bijections for ranking and unranking. For each alphabet $A$, introduce the temporary
notation $S_k(A)$ to denote the set of all $k$-element subsets of $A$. We assume that all alphabets
to be considered have been given some fixed total ordering that allows us to rank and unrank
individual letters of the alphabet. Suppose $A = \{x_0 < x_1 < \cdots < x_{n-1}\}$ is such an alphabet
with $n$ letters. We can write $S_k(A)$ as the disjoint union of sets $T$ and $U$, where $T$ consists
of all subsets that do not contain $x_{n-1}$ and $U$ consists of all subsets that do contain $x_{n-1}$.
Note that $T = S_k(A - \{x_{n-1}\})$, and there is a bijection from $U$ onto $S_{k-1}(A - \{x_{n-1}\})$ that
acts by deleting $x_{n-1}$ from a subset belonging to $U$. We can use recursion to obtain ranking
and unranking maps for $S_k(A - \{x_{n-1}\})$ and $S_{k-1}(A - \{x_{n-1}\})$, as the members of these sets
are subsets drawn from smaller alphabets. Then we combine these maps using the Bijective
Sum Rule to get ranking and unranking maps for $S_k(A)$.

By expanding the definitions, we arrive at the following recursive ranking algorithm for
mapping a subset $B \in S_k(A)$ to an integer:

• If $k = 0$ (so $B = \emptyset$), then return the answer 0.

• If $k > 0$ and the last letter $x$ in $A$ does not belong to $B$, then return the ranking of $B$
relative to the set $S_k(A - \{x\})$, which we compute recursively using this very algorithm.

• If $k > 0$ and the last letter $x$ in $A$ does belong to $B$, let $i$ be the rank of $B' = B - \{x\}$ relative
to the set $S_{k-1}(A - \{x\})$ (computed recursively), and return the answer $C(n-1, k) + i$.
Note that $C(n-1, k)$ can be computed using the recursion (6.2) for binomial coefficients.


The inverse map is the following recursive unranking algorithm that maps an integer
$m \in [C(n, k)]$ to a subset $B \in S_k(A)$:

• If $k = 0$ (so $m$ must be zero), then return $\emptyset$.

• If $k > 0$ and $0 \le m < C(n-1, k)$, then return the result of unranking $m$ relative to the
set $S_k(A - \{x\})$, where $x$ is the last letter of $A$.

• If $k > 0$ and $C(n-1, k) \le m < C(n, k)$, then let $B'$ be the subset obtained by unranking
$m - C(n-1, k)$ relative to the set $S_{k-1}(A - \{x\})$, and return $B' \cup \{x\}$.
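The two recursive algorithms above translate almost line by line into Python. The sketch below is illustrative only: the names `rank_subset` and `unrank_subset` are ours, the alphabet `A` is assumed to be a string or list given in its total order, `B` is assumed to be a Python set contained in `A`, and `math.comb` stands in for $C(n, k)$.

```python
from math import comb


def rank_subset(B, A):
    """Rank the set B among the len(B)-element subsets of the ordered alphabet A."""
    k, n = len(B), len(A)
    if k == 0:
        return 0
    x, rest = A[-1], A[:-1]          # last letter of A, and A with x removed
    if x not in B:
        return rank_subset(B, rest)
    return comb(n - 1, k) + rank_subset(B - {x}, rest)


def unrank_subset(m, k, A):
    """Return the k-element subset of A with rank m, where 0 <= m < C(len(A), k)."""
    n = len(A)
    if k == 0:
        return set()
    x, rest = A[-1], A[:-1]
    if m < comb(n - 1, k):
        return unrank_subset(m, k, rest)
    return unrank_subset(m - comb(n - 1, k), k - 1, rest) | {x}
```

Note that `comb(n - 1, k)` returns 0 when $k > n - 1$, which covers the case $k = n$ without a separate branch.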


6.20. Example. Let $A$ = {a,b,c,d,e,f,g,h} ordered alphabetically, and let us rank the subset
$B$ = {c,d,f,g} $\in S_4(A)$. Since the last letter of $A$ (namely h) is not in $B$, we recursively
proceed to rank $B$ relative to the 7-letter alphabet $A_1$ = {a,b,c,d,e,f,g}. The new last letter
g does belong to $B$, so we must add $C(7-1, 4) = 15$ to the rank of $B_1$ = {c,d,f} as
a three-element subset of {a,b,c,d,e,f}. The last letter f belongs to $B_1$, so we must add
$C(6-1, 3) = 10$ to the rank of $B_2$ = {c,d} as a two-element subset of {a,b,c,d,e}. Since
e is not in $B_2$, this rank is the same as the rank of $B_2$ relative to {a,b,c,d}, which is
$C(4-1, 2) = 3$ plus the rank of {c} as a one-element subset of {a,b,c}. In turn, this rank
is $C(3-1, 1) = 2$ plus the rank of $\emptyset$ as a zero-element subset of {a,b}. The rank of the
empty set is 0 by the base case of the algorithm. Adding up the contributions, we see that
the rank of $B$ is

\[ \binom{3-1}{1} + \binom{4-1}{2} + \binom{6-1}{3} + \binom{7-1}{4} = 2 + 3 + 10 + 15 = 30. \]


Generalizing the pattern in the previous example, we can convert the recursive algorithm 
to the following summation formula for the rank of a subset. 


6.21. Theorem: Sum Formula for Ranking Subsets. If $A = \{x_0 < x_1 < \cdots < x_{n-1}\}$
and $B = \{x_{i_1}, x_{i_2}, \ldots, x_{i_k}\}$ where $i_1 < i_2 < \cdots < i_k$, then the rank of $B$ as a member of
$S_k(A)$ is

\[ \sum_{j=1}^{k} \binom{i_j}{j}. \]

This formula can be proved by induction on $k$.
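As a sanity check on the sum formula (not a proof), the following Python sketch evaluates it on every 4-element subset of an 8-letter alphabet and collects the resulting ranks. The function name `rank_by_sum` is our own; subsets are represented by the sorted list of indices $i_1 < \cdots < i_k$ of their letters.

```python
from itertools import combinations
from math import comb


def rank_by_sum(indices):
    """Sum formula: add C(i_j, j) over the sorted indices i_1 < ... < i_k (j starts at 1)."""
    return sum(comb(i, j) for j, i in enumerate(sorted(indices), start=1))


n, k = 8, 4
# Ranks of all k-element subsets of {0, ..., n-1}, sorted for comparison with [C(n, k)].
ranks = sorted(rank_by_sum(c) for c in combinations(range(n), k))
```

If the formula defines a bijection onto $[C(8,4)]$, then `ranks` should be exactly the integers $0, 1, \ldots, 69$, each occurring once.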


6.22. Example. Now we illustrate the recursive unranking algorithm. Let us unrank the
integer 53 to obtain an object $B \in S_4(A)$, where $A$ = {a,b,c,d,e,f,g,h}. Here $n = 8$ and $k = 4$.
Since $C(7, 4) = 35 \le 53$, we know that h $\in B$. We proceed by unranking $53 - 35 = 18$ to
get a three-element subset of {a,b,c,d,e,f,g}. This time $C(6, 3) = 20 > 18$, so g is not in
the subset. We proceed to unrank 18 to get a three-element subset of {a,b,c,d,e,f}. Now
$C(5, 3) = 10 \le 18$, so f does belong to $B$. We continue, unranking $18 - 10 = 8$ to get a
two-element subset of {a,b,c,d,e}. Since $C(4, 2) = 6 \le 8$, e $\in B$ and we continue by unranking
$8 - 6 = 2$ to get a one-element subset of {a,b,c,d}. We have $C(3, 1) = 3 > 2$, so d $\notin B$. But at the
next stage $C(2, 1) = 2 \le 2$, so c $\in B$. We conclude, finally, that $B$ = {c,e,f,h}.


As before, we can describe this algorithm iteratively instead of recursively. 


6.23. Unranking Algorithm for Subsets. Suppose $A = \{x_0 < x_1 < \cdots < x_{n-1}\}$ and
we are unranking an integer $m$ to get a $k$-element subset $B$ of $A$. Repeatedly perform the
following steps until $k$ becomes zero: let $i$ be the largest integer such that $\binom{i}{k} \le m$; declare
that $x_i \in B$; replace $m$ by $m - \binom{i}{k}$ and decrement $k$ by 1.
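The iterative unranking algorithm can be sketched in Python as shown below. The function name `unrank_subset_iterative` is our own, and we assume `A` is a string or list in its total order and that $0 \le m < C(n, k)$; under that assumption the index search never runs past the end of `A`.

```python
from math import comb


def unrank_subset_iterative(m, k, A):
    """Unrank m to a k-element subset of the ordered alphabet A, assuming 0 <= m < C(len(A), k)."""
    B = []
    while k > 0:
        i = k - 1                      # comb(k-1, k) == 0 <= m, so the search can start here
        while i + 1 < len(A) and comb(i + 1, k) <= m:
            i += 1                     # advance to the largest i with comb(i, k) <= m
        B.append(A[i])                 # declare x_i to be in B
        m -= comb(i, k)
        k -= 1
    return set(B)
```

Each pass selects a strictly smaller index than the previous one, because after subtracting $\binom{i}{k}$ the new $m$ is less than $\binom{i+1}{k} - \binom{i}{k} = \binom{i}{k-1}$.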
We close with a remark about the ordering of subsets associated to the ranking and
unranking algorithms described above. Let $x$ be the last letter of $A$. If we unrank the
integers $0, 1, 2, \ldots$ in this order to obtain a listing of $S_k(A)$, we will obtain all $k$-element
subsets of $A$ not containing $x$ first, and all $k$-element subsets of $A$ containing $x$ second. Each
of these sublists is internally ordered in the same way according to the next-to-last letter
of $A$, and so on recursively. In contrast, if we had applied the Bijective Sum Rule using the
recursion

\[ C(n, k) = C(n-1, k-1) + C(n-1, k) \]

(in which the order of the summands is swapped), then the ordering rules at each level
of this hierarchy would be reversed. Similarly, the reader can construct variant ranking
algorithms in which the first letter of the alphabet is considered most significant, etc. Some
of these variants are explored in the exercises.
6.8 Ranking Anagrams


Next we study the problem of ranking and unranking anagrams. Recall that $R(a_1^{n_1} \cdots a_k^{n_k})$
is the set of all words of length $n = n_1 + \cdots + n_k$ consisting of $n_i$ copies of $a_i$ for $1 \le i \le k$.
By the Anagram Rule, we know these sets are counted by the multinomial coefficients:

\[ |R(a_1^{n_1} \cdots a_k^{n_k})| = \binom{n}{n_1, \ldots, n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}. \]


There are at least three ways of deriving this formula. One way counts permutations of
$n$ distinct letters in two ways, and solves for the number of anagrams by division. This
method is not easily converted into a ranking algorithm. A second way uses the Product
Rule, choosing the positions for the $n_1$ copies of $a_1$, then the positions for the $n_2$ copies of
$a_2$, and so on. Combining the Bijective Product Rule with the ranking algorithm for subsets
presented earlier, this method does lead to a ranking algorithm for anagrams. A third way
to count anagrams involves finding recursions satisfied by multinomial coefficients (§2.8).
This is the approach we pursue here.

Let $C(n; n_1, \ldots, n_k)$ be the number of rearrangements of $n$ letters, where there are $n_i$
letters of type $i$. Classifying words by their first letter leads to the recursion

\[ C(n; n_1, \ldots, n_k) = \sum_{i=1}^{k} C(n-1; n_1, \ldots, n_i - 1, \ldots, n_k). \]

Applying the Bijective Sum Rule to this recursion, we are led to recursive ranking and
unranking algorithms for anagrams.
Here are the details of the algorithms. We recursively define ranking maps

\[ r_{n_1,\ldots,n_k} : R(a_1^{n_1} \cdots a_k^{n_k}) \to [M], \]

where $M = (n_1 + \cdots + n_k)!/(n_1! \cdots n_k!)$. If any $n_i$ is negative, $r_{n_1,\ldots,n_k}$ is the empty function.
If all $n_i$'s are zero, $r_{n_1,\ldots,n_k}$ is the function sending the empty word to 0. To compute
$r_{n_1,\ldots,n_k}(w)$ in the remaining case, suppose $a_i$ is the first letter of $w$. Write $w = a_i w'$.
Return the answer

\[ r_{n_1,\ldots,n_k}(w) = \sum_{j<i} C(n-1; n_1, \ldots, n_j - 1, \ldots, n_k) + r_{n_1,\ldots,n_i - 1,\ldots,n_k}(w'), \]

where the rank of $w'$ is computed recursively by the same algorithm.
Next, we define the corresponding unranking maps

\[ u_{n_1,\ldots,n_k} : [M] \to R(a_1^{n_1} \cdots a_k^{n_k}). \]

Use the only possible maps if some $n_i < 0$ or if all $n_i = 0$. Otherwise, to unrank $s \in [M]$,
first find the maximal index $i$ such that $n_i > 0$ and $\sum_{j<i} C(n-1; n_1, \ldots, n_j - 1, \ldots, n_k) \le s$;
let $s'$ be the difference between $s$ and this sum. Recursively compute the word

\[ w' = u_{n_1,\ldots,n_i - 1,\ldots,n_k}(s'), \]

and return the answer $w = a_i w'$. This unranking algorithm induces a listing of the anagrams
in $R(a_1^{n_1} \cdots a_k^{n_k})$ in alphabetical order relative to the alphabet ordering $a_1 < a_2 < \cdots < a_k$.


6.24. Example. Let us compute the rank of the word $w$ = abbcacb in $R(a^2 b^3 c^2)$; here
$n = 7$, $n_1 = 2$, $n_2 = 3$, and $n_3 = 2$. Erasing the first letter a, we see that the rank of $w$
equals zero plus the rank of $w_1$ = bbcacb; now $n = 6$, $n_1 = 1$, $n_2 = 3$, and $n_3 = 2$. Erasing
b, we must now add $\binom{5}{0,3,2} = 10$ to the rank of $w_2$ = bcacb; now $n = 5$, $n_1 = 1$, $n_2 = 2$,
and $n_3 = 2$. Erasing the next b, we must add $\binom{4}{0,2,2} = 6$ to the rank of $w_3$ = cacb; now
$n = 4$, $n_1 = 1$, $n_2 = 1$, and $n_3 = 2$. Erasing c, we must add $\binom{3}{0,1,2} + \binom{3}{1,0,2} = 6$ to the rank
of $w_4$ = acb; now $n = 3$, $n_1 = 1$, $n_2 = 1$, and $n_3 = 1$. Continuing in this way, we see that
the rank of acb is 1. Thus, the rank of the original word is $10 + 6 + 6 + 1 = 23$.

Next, let us unrank 91 to obtain a word $w$ in $R(a^2 b^3 c^2)$. To determine the first letter of
$w$, note that $0 \le 91$, $\binom{6}{1,3,2} = 60 \le 91$, but $\binom{6}{1,3,2} + \binom{6}{2,2,2} = 150 > 91$. Thus, the first letter
is b, and we continue by unranking $91 - 60 = 31$ to obtain a word in $R(a^2 b^2 c^2)$. This time,
we have $0 \le 31$, $\binom{5}{1,2,2} = 30 \le 31$, but $\binom{5}{1,2,2} + \binom{5}{2,1,2} = 60 > 31$. So the second letter is
b, and we continue by unranking $31 - 30 = 1$ to obtain a word in $R(a^2 b^1 c^2)$. It is routine
to check that the next two letters are both a, and we continue by unranking 1 to obtain
a word in $R(a^0 b^1 c^2)$. The word we get is cbc, so the unranking map sends 91 to the word
$w$ = bbaacbc.
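The anagram ranking and unranking maps can be sketched in Python as follows. This is an iterative rendering of the recursive definitions, written under our own assumptions: the names `multinomial`, `rank_anagram`, and `unrank_anagram` are ours, the alphabet is a string listing $a_1 < a_2 < \cdots < a_k$ in order, and a multinomial coefficient with a negative entry is taken to be 0 (matching the convention that the corresponding anagram set is empty).

```python
from math import factorial


def multinomial(counts):
    """Number of anagrams with the given letter multiplicities (0 if any count is negative)."""
    if any(c < 0 for c in counts):
        return 0
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total


def rank_anagram(word, alphabet):
    """Rank `word` among all rearrangements of its letters, in alphabetical order."""
    counts = [word.count(a) for a in alphabet]
    rank = 0
    for letter in word:
        i = alphabet.index(letter)
        for j in range(i):                 # words starting with an earlier letter come first
            counts[j] -= 1
            rank += multinomial(counts)
            counts[j] += 1
        counts[i] -= 1                     # erase the first letter and continue with w'
    return rank


def unrank_anagram(s, counts, alphabet):
    """Return the word of rank s in R(a_1^{n_1} ... a_k^{n_k}), where counts = [n_1, ..., n_k]."""
    counts = list(counts)
    if not 0 <= s < multinomial(counts):
        raise ValueError("rank out of range")
    word = []
    for _ in range(sum(counts)):
        for i, letter in enumerate(alphabet):
            counts[i] -= 1
            block = multinomial(counts)    # number of words starting with this letter
            if s < block:
                word.append(letter)
                break
            s -= block                     # skip this block and restore the count
            counts[i] += 1
    return "".join(word)
```

Calling `rank_anagram("abbcacb", "abc")` and `unrank_anagram(91, [2, 3, 2], "abc")` reproduces the two computations in Example 6.24.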
6.9 Ranking Integer Partitions


In this section, we devise ranking and unranking algorithms for integer partitions by applying
the Bijective Sum Rule to Recursion 2.38. Let $P(n, k)$ be the set of integer partitions
of $n$ with largest part $k$, and let $p(n, k) = |P(n, k)|$. We have seen that these numbers satisfy
the recursion

\[ p(n, k) = p(n-k, k) + p(n-1, k-1) \quad \text{for } 0 < k < n. \tag{6.3} \]

The first term on the right counts elements of $P(n, k)$ in which the largest part occurs
at least twice (deleting the first copy of this part gives a bijection onto $P(n-k, k)$). The
second term on the right counts elements of $P(n, k)$ in which the largest part occurs exactly
once (reducing this part by 1 gives a bijection onto $P(n-1, k-1)$). Combining these
bijections with the Bijective Sum Rule, we obtain recursively determined ranking maps
$r_{n,k} : P(n, k) \to [p(n, k)]$. To find $r_{n,k}(\mu)$, consider three cases. If $\mu$ has only one part (which
happens when $k = n$), return 0. If $k = \mu_1 = \mu_2$, return $r_{n-k,k}((\mu_2, \mu_3, \ldots))$. If $k = \mu_1 > \mu_2$,
return $p(n-k, k) + r_{n-1,k-1}((\mu_1 - 1, \mu_2, \ldots))$. The unranking maps $u_{n,k} : [p(n, k)] \to P(n, k)$
operate as follows. To compute $u_{n,k}(m)$ where $0 \le m < p(n, k)$, consider three cases. If $k = n$
(so $m = 0$), return $\mu = (n)$. If $0 \le m < p(n-k, k)$, recursively compute $\nu = u_{n-k,k}(m)$
and return the answer $\mu = (k, \nu_1, \nu_2, \ldots)$. If $p(n-k, k) \le m < p(n, k)$, recursively compute
$\nu = u_{n-1,k-1}(m - p(n-k, k))$ and return the answer $\mu = (\nu_1 + 1, \nu_2, \nu_3, \ldots)$.


6.25. Example. Let us compute $r_{8,3}(\mu)$, where $\mu = (3, 3, 1, 1)$. Since $\mu_1 = \mu_2$, the rank is
$r_{5,3}(\nu)$, where $\nu = (3, 1, 1)$. Next, since $\nu_1 \ne \nu_2$, we have

\[ r_{5,3}(3, 1, 1) = p(2, 3) + r_{4,2}(2, 1, 1) = r_{4,2}(2, 1, 1). \]

The first two parts of the new partition are again different, so

\[ r_{4,2}(2, 1, 1) = p(2, 2) + r_{3,1}(1, 1, 1) = 1 + r_{3,1}(1, 1, 1). \]

After several more steps, we find that $r_{3,1}(1, 1, 1) = 0$, so $r_{8,3}(\mu) = 1$. Thus $\mu$ is the second
partition in the listing of $P(8, 3)$ implied by the ranking algorithm; the first partition in
this list, which has rank 0, is $(3, 3, 2)$.

Next, let us compute $\mu = u_{10,4}(6)$. First, $p(6, 4) = 2 \le 6$, so $\mu$ is obtained by adding 1
to the first part of $\nu = u_{9,3}(4)$. Second, $p(6, 3) = 3 \le 4$, so $\nu$ is obtained by adding 1 to
the first part of $\rho = u_{8,2}(1)$. Third, $p(6, 2) = 3 > 1$, so $\rho$ is obtained by adding a new first
part of length 2 to $\xi = u_{6,2}(1)$. Fourth, $p(4, 2) = 2 > 1$, so $\xi$ is obtained by adding a new
first part of length 2 to $\zeta = u_{4,2}(1)$. Fifth, $p(2, 2) = 1 \le 1$, so $\zeta$ is obtained by adding 1 to
the first part of $\omega = u_{3,1}(0)$. We must have $\omega = (1, 1, 1)$, this being the unique element of
$P(3, 1)$. Working our way back up the chain, we successively find that

\[ \zeta = (2, 1, 1), \quad \xi = (2, 2, 1, 1), \quad \rho = (2, 2, 2, 1, 1), \quad \nu = (3, 2, 2, 1, 1), \]

and finally $\mu = u_{10,4}(6) = (4, 2, 2, 1, 1)$.


Now that we have algorithms to rank and unrank the sets P(n,k), we can apply the 
Bijective Sum Rule to the identity 


\[ p(n) = p(n, n) + p(n, n-1) + \cdots + p(n, 1) \]
to rank and unrank the set P(n) of all integer partitions of n. 


6.26. Example. Let us enumerate all the integer partitions of 6. We obtain this list of
partitions by concatenating the lists associated to the sets

\[ P(6, 6),\ P(6, 5),\ \ldots,\ P(6, 1), \]

written in this order. In turn, each of these lists can be constructed by applying the
unranking maps $u_{6,k}$ to the integers $0, 1, 2, \ldots, p(6, k) - 1$. The reader can verify that this
procedure leads to the following list:

(6), (5,1), (4,2), (4,1,1), (3,3), (3,2,1), (3,1,1,1),
(2,2,2), (2,2,1,1), (2,1,1,1,1), (1,1,1,1,1,1).

It can also be checked that the list obtained in this way presents the integer partitions
of $n$ in decreasing lexicographic order.
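The maps $r_{n,k}$ and $u_{n,k}$, together with the listing procedure of Example 6.26, can be sketched in Python as follows. The names `num_partitions`, `rank_partition`, `unrank_partition`, and `partitions_of` are our own; partitions are represented as lists of weakly decreasing positive integers, and we take $p(n, k) = 0$ whenever $k > n$ or either argument is not positive.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def num_partitions(n, k):
    """p(n, k): the number of partitions of n whose largest part is k."""
    if n <= 0 or k <= 0 or k > n:
        return 0
    if k == n:
        return 1
    return num_partitions(n - k, k) + num_partitions(n - 1, k - 1)   # recursion (6.3)


def rank_partition(mu):
    """r_{n,k}(mu), where n = sum(mu) and k = mu[0]."""
    mu = list(mu)
    n, k = sum(mu), mu[0]
    if len(mu) == 1:
        return 0
    if mu[0] == mu[1]:                       # largest part occurs at least twice
        return rank_partition(mu[1:])
    return num_partitions(n - k, k) + rank_partition([mu[0] - 1] + mu[1:])


def unrank_partition(m, n, k):
    """u_{n,k}(m), assuming 0 <= m < p(n, k)."""
    if k == n:
        return [n]
    if m < num_partitions(n - k, k):
        return [k] + unrank_partition(m, n - k, k)
    nu = unrank_partition(m - num_partitions(n - k, k), n - 1, k - 1)
    return [nu[0] + 1] + nu[1:]


def partitions_of(n):
    """List all partitions of n by concatenating the lists for P(n, n), P(n, n-1), ..., P(n, 1)."""
    return [unrank_partition(m, n, k)
            for k in range(n, 0, -1)
            for m in range(num_partitions(n, k))]
```

Here `partitions_of(6)` produces the eleven partitions of 6 in the order displayed in Example 6.26.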




6.10 Ranking Set Partitions 


Next, we consider the ranking and unranking of set partitions (which are counted by Stirling
numbers of the second kind and Bell numbers). The recursion for Stirling numbers involves
both addition and multiplication, so our recursive algorithms use both the Bijective Sum
Rule and the Bijective Product Rule.

Let $SP(n, k)$ be the set of all set partitions of $\{1, 2, \ldots, n\}$ into exactly $k$ blocks, and
let $S(n, k) = |SP(n, k)|$ be the associated Stirling number of the second kind. Recall from
Recursion 2.45 that

\[ S(n, k) = S(n-1, k-1) + k\,S(n-1, k) \quad \text{for } 0 < k < n. \]

The first term counts set partitions in $SP(n, k)$ such that $n$ is in a block by itself; removal
of this block gives a bijection onto $SP(n-1, k-1)$. The second term counts set partitions
$\pi$ in $SP(n, k)$ such that $n$ belongs to a block with other elements. Starting with any set
partition $\pi'$ in $SP(n-1, k)$, we can build such a set partition $\pi \in SP(n, k)$ by adding
$n$ to any of the $k$ nonempty blocks of $\pi'$. We index the blocks of $\pi'$ using $0, 1, \ldots, k-1$
by arranging the minimum elements of these blocks in increasing order. For example, if
$\pi' = \{\{6,3,5\}, \{2\}, \{1,7\}, \{8,4\}\}$, then block 0 of $\pi'$ is $\{1,7\}$, block 1 is $\{2\}$, block 2 is
$\{3,5,6\}$, and block 3 is $\{4,8\}$.

The ranking maps $r_{n,k} : SP(n, k) \to [S(n, k)]$ are defined recursively as follows. Use the
only possible maps if $k \le 0$ or $k \ge n$. For $0 < k < n$, compute $r_{n,k}(\pi)$ as follows. If $\{n\} \in \pi$,
return the answer $r_{n-1,k-1}(\pi - \{\{n\}\})$. Otherwise, let $\pi'$ be obtained from $\pi$ by deleting $n$
from whatever block contains it, and let $i$ be the index of the block of $\pi$ that contains $n$.
Return $S(n-1, k-1) + p_{k,S(n-1,k)}(i, r_{n-1,k}(\pi'))$.

We define the unranking maps $u_{n,k} : [S(n, k)] \to SP(n, k)$ as follows. Assume $0 < k < n$
and we are computing $u_{n,k}(m)$. If $0 \le m < S(n-1, k-1)$, then return $u_{n-1,k-1}(m) \cup \{\{n\}\}$.
If $S(n-1, k-1) \le m < S(n, k)$, first compute $(i, j) = p^{-1}_{k,S(n-1,k)}(m - S(n-1, k-1))$.
Next, calculate the partition $\pi' = u_{n-1,k}(j)$ by unranking $j$ recursively, and finally compute
$\pi$ by adding $n$ to the $i$th block of $\pi'$.

6.27. Example. Let us compute the rank of $\pi = \{\{1,7\}, \{2,4,5\}, \{3,8\}, \{6\}\}$ relative to
the set $SP(8, 4)$. In the first stage of the recursion, removal of the largest element 8 from
block 2 leaves the set partition $\pi' = \{\{1,7\}, \{2,4,5\}, \{3\}, \{6\}\}$. Therefore,

\[ r_{8,4}(\pi) = S(7, 3) + 2\,S(7, 4) + r_{7,4}(\pi') = 301 + 2 \cdot 350 + r_{7,4}(\pi'). \]

(See Figure 2.20 for a table of Stirling numbers, which are calculated using the recursion
for $S(n, k)$.) In the second stage, removing 7 from block 0 leaves the set partition
$\pi'' = \{\{1\}, \{2,4,5\}, \{3\}, \{6\}\}$. Hence,

\[ r_{7,4}(\pi') = S(6, 3) + 0\,S(6, 4) + r_{6,4}(\pi'') = 90 + r_{6,4}(\pi''). \]

In the third stage, removing the block $\{6\}$ leaves the set partition $\pi^{(3)} = \{\{1\}, \{2,4,5\}, \{3\}\}$,
and

\[ r_{6,4}(\pi'') = r_{5,3}(\pi^{(3)}). \]

In the fourth stage, removing 5 from block 1 leaves the set partition $\pi^{(4)} = \{\{1\}, \{2,4\}, \{3\}\}$,
and

\[ r_{5,3}(\pi^{(3)}) = S(4, 2) + 1\,S(4, 3) + r_{4,3}(\pi^{(4)}) = 7 + 6 + r_{4,3}(\pi^{(4)}). \]

In the fifth stage, removing 4 from block 1 leaves the set partition $\pi^{(5)} = \{\{1\}, \{2\}, \{3\}\}$,
and

\[ r_{4,3}(\pi^{(4)}) = S(3, 2) + 1\,S(3, 3) + r_{3,3}(\pi^{(5)}) = 3 + 1 + r_{3,3}(\pi^{(5)}). \]

Now $r_{3,3}(\pi^{(5)})$ is 0, since $|SP(3, 3)| = 1$. We deduce in sequence

\[ r_{4,3}(\pi^{(4)}) = 4, \quad r_{6,4}(\pi'') = r_{5,3}(\pi^{(3)}) = 17, \quad r_{7,4}(\pi') = 107, \quad r_{8,4}(\pi) = 1108. \]
Next, let us compute $u_{7,3}(111)$. The input 111 weakly exceeds $S(6, 2) = 31$, so we must
first compute $p^{-1}_{3,90}(111 - 31) = (0, 80)$. This means that 7 goes in block 0 of $u_{6,3}(80)$.
Now $80 \ge S(5, 2) = 15$, so we compute $p^{-1}_{3,25}(80 - 15) = (2, 15)$. This means that 6 goes
in block 2 of $u_{5,3}(15)$. Now $15 \ge S(4, 2) = 7$, so we compute $p^{-1}_{3,6}(15 - 7) = (1, 2)$. This
means that 5 goes in block 1 of $u_{4,3}(2)$. Now $2 < S(3, 2) = 3$, so 4 is in a block by itself
in $u_{4,3}(2)$. To find the remaining blocks, we compute $u_{3,2}(2)$. Now $2 \ge S(2, 1) = 1$, so
we compute $p^{-1}_{2,1}(2 - 1) = (1, 0)$. This means that 3 goes in block 1 of $u_{2,2}(0)$. Evidently,
$u_{2,2}(0) = \{\{1\}, \{2\}\}$. Using the preceding information to insert elements $3, 4, \ldots, 7$, we
conclude that

\[ u_{7,3}(111) = \{\{1, 7\}, \{2, 3, 5\}, \{4, 6\}\}. \]
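The set partition ranking and unranking maps can be sketched in Python as follows. The names `S`, `canonical`, `rank_set_partition`, and `unrank_set_partition` are our own. We represent a set partition as a list of sorted blocks arranged by increasing minimum element, so that list position $i$ is exactly "block $i$" in the sense used above; the map $p_{k,S(n-1,k)}(i, j)$ is computed as $i \cdot S(n-1, k) + j$ and inverted with `divmod`.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def S(n, k):
    """Stirling number of the second kind."""
    if n == k:
        return 1
    if k <= 0 or k > n:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)


def canonical(blocks):
    """Sort each block, then order the blocks by their minimum elements."""
    return sorted(sorted(b) for b in blocks)


def rank_set_partition(blocks):
    """r_{n,k}(pi) for a set partition pi of {1, ..., n} into k blocks."""
    pi = canonical(blocks)
    n, k = sum(len(b) for b in pi), len(pi)
    if k <= 0 or k >= n:
        return 0                              # SP(n, k) has at most one element
    if [n] in pi:                             # n is in a block by itself
        return rank_set_partition([b for b in pi if b != [n]])
    i = next(idx for idx, b in enumerate(pi) if n in b)
    smaller = [[x for x in b if x != n] for b in pi]
    return S(n - 1, k - 1) + i * S(n - 1, k) + rank_set_partition(smaller)


def unrank_set_partition(m, n, k):
    """u_{n,k}(m), assuming 0 < k <= n and 0 <= m < S(n, k)."""
    if k == n:
        return [[x] for x in range(1, n + 1)]
    if m < S(n - 1, k - 1):
        return unrank_set_partition(m, n - 1, k - 1) + [[n]]
    i, j = divmod(m - S(n - 1, k - 1), S(n - 1, k))
    pi = unrank_set_partition(j, n - 1, k)
    pi[i].append(n)                           # n joins block i; block minima are unchanged
    return pi
```

Because $n$ is always the largest element present, appending it to a block (or adding the block $\{n\}$ at the end) keeps the list in canonical order, so no re-sorting is needed during unranking.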


The ranking procedure given here lists the objects in SP(n,k) in the following order. 
Set partitions with n in its own block appear first. Next come the set partitions with n in 
block 0 (i.e., n is in the same block as 1); then come the set partitions with n in block 1, 
etc. By applying the sum and product bijections in different orders, one can obtain different
listings of the elements of $SP(n, k)$.

Let $SP(n)$ be the set of all set partitions of $\{1, 2, \ldots, n\}$, so $|SP(n)|$ is the $n$th Bell number. The
preceding results lead to ranking and unranking algorithms for this collection, by applying
the Bijective Sum Rule to the disjoint union

\[ SP(n) = SP(n, 1) \cup SP(n, 2) \cup \cdots \cup SP(n, n). \]


Another approach to ranking SP(n) is to use Recursion 2.46 for Bell numbers; see the 
exercises for details. 
6.11 Ranking Trees


By the Rooted Tree Rule in §3.7, there are $n^{n-2}$ rooted trees on the vertex set $\{1, 2, \ldots, n\}$
rooted at vertex 1. Let $B$ be the set of such trees; we seek ranking and unranking algorithms
for $B$. One way to obtain these algorithms is to use the bijective proof of the Rooted Tree
Rule. In that proof, we described a bijection $\phi' : B \to A$, where $A$ is the set of all functions
$f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ such that $f(1) = 1$ and $f(n) = n$. Let $C$ be the set of
words of length $n - 2$ in the alphabet $\{0, 1, \ldots, n-1\}$. The map $\psi : A \to C$ such that
$\psi(f) = w_1 \cdots w_{n-2}$ with $w_i = f(i+1) - 1$ is a bijection. The map $p_{n,n,\ldots,n}$ is a bijection
from $C = [n]^{n-2}$ to $[n^{n-2}]$. Composing all these bijections, we get the required ranking
algorithm. Inverting the bijections gives the associated unranking algorithm.


6.28. Example. Consider the rooted tree $T$ shown in Figure 3.9. In Example 3.49, we
computed $\phi'(T)$ to be the function $g$ such that

\[ (g(1), g(2), \ldots, g(9)) = (1, 2, 2, 9, 9, 7, 6, 9, 9). \]

Then $\psi(g)$ is the word 1188658. Applying $p_{9,9,\ldots,9}$ (which interprets
the given word as a number written in base 9), we find that $\mathrm{rk}(T) = 649{,}349$.
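The last two steps of this ranking map (the bijections $\psi$ and $p_{n,\ldots,n}$) can be sketched in Python as follows. The sketch does not implement the tree-to-function bijection $\phi'$, which is defined elsewhere in the book; it starts from the list of values $(f(1), \ldots, f(n))$. The function names are our own, and we assume $n \ge 2$.

```python
def rank_tree_function(values):
    """Given values = (f(1), ..., f(n)) with f(1) = 1 and f(n) = n, return p(psi(f))."""
    n = len(values)
    if values[0] != 1 or values[-1] != n:
        raise ValueError("need f(1) = 1 and f(n) = n")
    rank = 0
    for v in values[1:-1]:           # w_i = f(i+1) - 1 for i = 1, ..., n-2
        rank = rank * n + (v - 1)    # read the word w as a number in base n
    return rank


def unrank_tree_function(rank, n):
    """Inverse map: recover (f(1), ..., f(n)) from a rank in [n**(n-2)]."""
    digits = []
    for _ in range(n - 2):
        rank, d = divmod(rank, n)
        digits.append(d + 1)         # undo the shift w_i = f(i+1) - 1
    return (1, *reversed(digits), n)
```

Applied to the function $g$ of Example 6.28, `rank_tree_function` returns the rank computed there.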


This application shows how valuable a bijective proof of a counting result can be. If 
we have a bijection from a complicated set of objects to a simpler set of objects (such as 
functions or words), we can compose the bijection with standard ranking maps to obtain 
ranking and unranking algorithms for the complicated objects. In contrast, if a counting 
result is obtained by an algebraic manipulation involving division or generating functions, 
it may not be so straightforward to extract an effective ranking mechanism. 
6.12 The Successor Sum Rule


We turn now from ranking algorithms to successor algorithms. Recall the setup from §6.1:
given a totally ordered finite set $S$, our goal is to find algorithms first, last, and next such
that first($S$) is the least element of $S$, last($S$) is the greatest element of $S$, and next($x, S$)
returns the immediate successor of $x$. Our strategy is to develop versions of the Bijection
Rule, the Sum Rule, and the Product Rule that automatically create these subroutines.


6.29. Example: Successor Algorithm for $[n]$. Consider the set $[n] = \{0, 1, 2, \ldots, n-1\}$
with the standard ordering $0 < 1 < \cdots < n-1$. We define first($[n]$) $= 0$, last($[n]$) $= n-1$,
and next($x, [n]$) $= x + 1$ for $0 \le x < n-1$.


6.30. The Bijection Rule for Successor Algorithms. Suppose $F : X \to Y$ is a
bijection with inverse $G$, and we already know successor algorithms for the finite set $Y$.
There are successor algorithms for $X$ defined by first($X$) $= G$(first($Y$)), last($X$) $=$
$G$(last($Y$)), and next($x, X$) $= G$(next($F(x), Y$)) for $x \in X$ with $x \ne$ last($X$).


6.31. The Successor Sum Rule. Assume the nonempty set $S$ is the union of pairwise
disjoint finite sets $S_1, \ldots, S_k$, and we already know successor algorithms for each $S_i$. We
can then define a successor algorithm for the set $S$ using the pseudocode in Figure 6.4.


Assumptions: S is the disjoint union of finite sets S_1,...,S_k;
S is nonempty; we already know successor subroutines for each S_i.


define procedure first(S):
{ i=1; while (S_i is empty) do { i=i+1; }
  return first(S_i);
}


define procedure last(S):
{ i=k; while (S_i is empty) do { i=i-1; }
  return last(S_i);
}


define procedure next(x,S): %% assumes x is not last(S)
{ find the unique i with x in S_i;
  if (x==last(S_i)) then
  { j=i+1; while (S_j is empty) do { j=j+1; }
    return first(S_j);
  }
  else return next(x,S_i);
}


FIGURE 6.4 
Pseudocode for the Successor Sum Rule. 


Figure 6.5 provides visual intuition for what the Successor Sum Rule is doing in the
case where every $S_i$ is nonempty. We are totally ordering the set $S = S_1 \cup S_2 \cup \cdots \cup S_k$ by
putting all the elements of $S_1$ first, then all the elements of $S_2$, and so on. The first element
of $S$ is the first element of $S_1$, and the last element of $S$ is the last element of $S_k$. Given
$x \in S$, we compute next($x, S$) as follows. If $x$ is the last element in $S_i$ where $i < k$, the
successor of $x$ in $S$ is the first element in $S_{i+1}$. If $x \in S_i$ is not last in $S_i$, the successor of $x$
in $S$ is the same as the successor of $x$ in $S_i$.

[Figure: the set S drawn as consecutive blocks S_1, S_2, S_3, S_4, with first(S_i) and last(S_i)
marked on each block and first(S), last(S) marked at the two ends.]

FIGURE 6.5
Schematic diagram for the Successor Sum Rule.

The pseudocode in the general case is a bit more complex since the subroutines must
skip over any sets $S_i$ that happen to be empty. We see in the next section that the Successor
Sum Rule often leads to recursively defined successor algorithms. Although all the sets $S_i$
at the top level of the recursion are typically nonempty, it is quite possible that empty sets
will be encountered in one of the recursive calls.
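One way to express the Successor Sum Rule in Python is sketched below. The interface is our own assumption rather than anything prescribed by the text: each ordered set is an object offering `is_empty`, `contains`, `first`, `last`, and `next`. The class `Interval` models a shifted copy of $[n]$ (Example 6.29 combined with the Bijection Rule), and `DisjointUnion` follows Figure 6.4, skipping empty pieces.

```python
class Interval:
    """The integers lo, lo+1, ..., hi-1 in increasing order (empty when lo >= hi)."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def is_empty(self):
        return self.lo >= self.hi
    def contains(self, x):
        return self.lo <= x < self.hi
    def first(self):
        return self.lo
    def last(self):
        return self.hi - 1
    def next(self, x):                       # assumes x is not last
        return x + 1


class DisjointUnion:
    """Successor Sum Rule: list S_1, then S_2, ..., skipping pieces that are empty."""
    def __init__(self, pieces):
        self.pieces = pieces                 # assumed pairwise disjoint
    def is_empty(self):
        return all(s.is_empty() for s in self.pieces)
    def contains(self, x):
        return any(s.contains(x) for s in self.pieces)
    def first(self):
        return next(s for s in self.pieces if not s.is_empty()).first()
    def last(self):
        return next(s for s in reversed(self.pieces) if not s.is_empty()).last()
    def next(self, x):                       # assumes x is not last(S)
        i = next(idx for idx, s in enumerate(self.pieces) if s.contains(x))
        if x == self.pieces[i].last():
            later = (s for s in self.pieces[i + 1:] if not s.is_empty())
            return next(later).first()
        return self.pieces[i].next(x)


def listing(S):
    """List the elements of S from first to last using only first, last, and next."""
    if S.is_empty():
        return []
    x = S.first()
    out = [x]
    while x != S.last():
        x = S.next(x)
        out.append(x)
    return out
```

Because a `DisjointUnion` offers the same interface as its pieces, unions can be nested, which mirrors the recursive use of the rule described above.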
6.13 Successor Algorithms for Anagrams


We now illustrate the rules in the previous section by converting a combinatorial recursion 
for multinomial coefficients into a successor algorithm for sets of anagrams. This algorithm 
specializes to give successor algorithms for permutations and subsets of a fixed size. 


6.32. Successor Algorithm for Anagrams. Let $\{a_1 < a_2 < \cdots < a_k\}$ be a fixed
ordered alphabet. We simultaneously construct successor algorithms for all the anagram
sets $R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})$ consisting of rearrangements of $n_i$ copies of $a_i$. We assume that
$n_i \ge 0$ for $1 \le i \le k$, so that each set of anagrams is nonempty. For a fixed sequence
$(n_1, \ldots, n_k)$, note that $S = R(a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k})$ is the disjoint union of sets $S_1, S_2, \ldots, S_k$,
where $S_i$ is the set of all words in $S$ (if there are any) starting with letter $a_i$. When $S_i$ is
nonempty (which happens iff $n_i > 0$), deleting the first letter $a_i$ of a word in $S_i$ gives a
bijection from $S_i$ onto the anagram set $R(a_1^{n_1} \cdots a_i^{n_i - 1} \cdots a_k^{n_k})$. We can assume by induction
on $n = n_1 + \cdots + n_k$ that successor algorithms are already available for each of the latter
sets. Using the successor rules, we are led to the recursive algorithm shown in Figure 6.6.
It can be checked that the first subroutine returns the word $a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k}$, whereas the
last subroutine returns the reversal of this word. It can also be proved by induction that
this successor algorithm generates the words in each anagram class in alphabetical order.


As an example, let us find the successor of $x$ = bdcaccb in the anagram set $R(a^1 b^2 c^3 d^1)$.
In the first call to next, we have $i = 2$, $a_i$ = b, and $y$ = dcaccb. This word is not last in
its anagram class, so we continue by finding next($y$). Repeated recursive calls lead us to
consider the words caccb, then accb, then ccb. But ccb is the last word in $R(a^0 b^1 c^2 d^0)$. To
continue calculating next(accb), we therefore look for the next available letter after a, which
is b. We concatenate b and first($R(a^1 b^0 c^2 d^0)$) = acc to obtain next(accb) = bacc. Going
Assumptions: n[1]...n[k] is an array where n[i] is the number of copies
of letter a_i in the anagram set under consideration; all n[i] >= 0.


define procedure first(n[1]...n[k]):
{ if (n[1]+...+n[k]==0) then return the empty word;
  i=1; while (n[i]==0) do { i=i+1; }
  n[i]=n[i]-1;
  return a_i.first(n[1]...n[k]); %% here . denotes concatenation of words
}


define procedure last(n[1]...n[k]):
{ if (n[1]+...+n[k]==0) then return the empty word;
  i=k; while (n[i]==0) do { i=i-1; }
  n[i]=n[i]-1;
  return a_i.last(n[1]...n[k]); %% here . denotes concatenation of words
}


define procedure next(x,n[1]...n[k]): %% assumes x is not last(S), so k>0
{ let x=a_i.y;
  let m[i]=n[i]-1, m[s]=n[s] for all s unequal to i;
  if (y==last(m[1]...m[k])) then
  { j=i+1; while (n[j]==0) do { j=j+1; }
    let p[j]=n[j]-1, p[s]=n[s] for all s unequal to j;
    return a_j.first(p[1]...p[k]);
  }
  else return a_i.next(y,m[1]...m[k]);
}


FIGURE 6.6 
Pseudocode for the Anagram Successor Algorithm. 


back up the chain of recursive calls, we then find next(caccb)=cbacc, next(dcaccb)=dcbacc,
and finally next(bdcaccb)=bdcbacc.

If we apply the next function to the new word bdcbacc, we strip off initial letters one 
at a time until we reach the suffix cc, which is the last word in its anagram class. So, to 
compute next(acc), we must find the next available letter after a, namely c, and append to 
this letter the word first($R(a^1b^0c^1d^0)$) = ac. Thus, next(acc)=cac, and working back up
the recursive calls leads to a final answer of bdcbcac. 

The pattern in these examples applies in general, leading to the following non-recursive
description of the next subroutine for anagrams. To compute the next word after
$w = w_1w_2\cdots w_n$, find the largest position i such that letter $w_i$ precedes letter $w_{i+1}$ in the given
ordering of the alphabet. Modify the suffix $w_i\cdots w_n$ by replacing $w_i$ with the next larger
letter appearing in this suffix, and then sorting the remaining letters of this suffix into
weakly increasing order. For example, next(cbcbdca)=cbccabd.
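
The non-recursive description can be sketched in Python as follows. This is only an illustration under stated assumptions, not the pseudocode of Figure 6.6: the function name next_anagram and the convention of passing the ordered alphabet as a string are choices made here, and the function returns None when the input word is last in its anagram class.

```python
def next_anagram(word, alphabet):
    """Return the successor of `word` in its anagram class, or None if it is last.

    `alphabet` lists the letters in increasing order (an assumption of this sketch).
    """
    rank = {letter: pos for pos, letter in enumerate(alphabet)}
    w = list(word)
    # Find the largest index i such that w[i] precedes w[i+1] in the alphabet.
    i = len(w) - 2
    while i >= 0 and rank[w[i]] >= rank[w[i + 1]]:
        i -= 1
    if i < 0:
        return None  # the word is weakly decreasing, so it is last in its class
    suffix = w[i:]
    # The next larger letter: the smallest letter in the suffix exceeding w[i].
    pivot = min((c for c in suffix if rank[c] > rank[w[i]]), key=rank.get)
    suffix.remove(pivot)
    # Sort the remaining suffix letters into weakly increasing order.
    suffix.sort(key=rank.get)
    return "".join(w[:i] + [pivot] + suffix)


print(next_anagram("bdcaccb", "abcd"))  # bdcbacc
print(next_anagram("cbcbdca", "abcd"))  # cbccabd
```

Because the alphabet ordering is a parameter, the same sketch handles permutations (each letter used once) and binary words with either ordering of 0 and 1.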


6.33. Successor Algorithm for Permutations. Permutations of the alphabet
$\{a_1, a_2, \ldots, a_k\}$ are the same thing as anagrams in $R(a_1^1a_2^1\cdots a_k^1)$. Thus the successor
algorithm for anagrams specializes to give a successor algorithm for permutations. We can use
the simplified (non-recursive) versions of the first, last, and next subroutines derived
above. The permutations of {1,2,3,4} are generated in the following order:
1234, 1243, 1324, 1342, 1423, 1432, 2134, 2143, ..., 4321. 


As another example, repeatedly applying next starting with the permutation 72648531, we 
obtain 72651348, then 72651384, then 72651438, then 72651483, and so on. 


6.34. Successor Algorithm for k-Element Subsets. Suppose $Z = \{z_1,\ldots,z_n\}$ is a
given set of size n, and $P_k(Z)$ is the set of all k-element subsets of Z. To obtain a successor
algorithm for $P_k(Z)$, we define a bijection $F : P_k(Z) \to R(1^k0^{n-k})$ and then apply the
successor algorithm for anagrams. The bijection F maps a subset S of Z to the word
$F(S) = w_1\cdots w_n \in R(1^k0^{n-k})$, where $w_i = 1$ if $z_i \in S$ and $w_i = 0$ if $z_i \notin S$. As an example,
let n = 5, k = 2, and Z = {1,2,3,4,5}. Using the alphabet ordering 1 < 0 on $R(1^20^3)$, the
successor algorithm generates the words in this anagram class in this order:


11000, 10100, 10010, 10001, 01100, 01010, 01001, 00110, 00101, 00011.

Applying $F^{-1}$, the associated subsets are:


{1,2}, {1,3}, {1,4}, {1,5}, {2,3}, {2,4}, {2,5}, {3,4}, {3,5}, {4,5}.


In general, the method used here lists k-element subsets of {1,2,...,n} in lexicographic 
order. Using the alphabet ordering 0 < 1 would have produced the reversal of the list 
displayed above. 
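
As a rough illustration of this construction, the following Python sketch encodes subsets as binary words, applies the anagram successor rule specialized to the ordering 1 < 0, and decodes the result. The helper names (subset_to_word, word_to_subset, next_binary_word, k_subsets) are hypothetical, introduced only for this example.

```python
def subset_to_word(subset, universe):
    """The bijection F: write 1 for each z_i in the subset and 0 otherwise."""
    return "".join("1" if z in subset else "0" for z in universe)


def word_to_subset(word, universe):
    """The inverse of F, returning the subset as a list in the order of `universe`."""
    return [z for z, bit in zip(universe, word) if bit == "1"]


def next_binary_word(word):
    """Successor of `word` in its anagram class under the ordering 1 < 0, or None if last."""
    i = word.rfind("10")  # largest i with w_i = 1 and w_{i+1} = 0, i.e. w_i < w_{i+1}
    if i < 0:
        return None  # the word has the form 0...01...1, which is last when 1 < 0
    suffix = word[i:]
    ones = suffix.count("1")       # every 1 in the suffix, including w_i, is re-sorted
    zeros = suffix.count("0") - 1  # one 0 moves to position i
    # Replace w_i by 0, then list the remaining letters in weakly increasing order.
    return word[:i] + "0" + "1" * ones + "0" * zeros


def k_subsets(universe, k):
    """List all k-element subsets of `universe` by repeatedly applying next."""
    word = "1" * k + "0" * (len(universe) - k)  # the first word when 1 < 0
    result = []
    while word is not None:
        result.append(word_to_subset(word, universe))
        word = next_binary_word(word)
    return result


for subset in k_subsets([1, 2, 3, 4, 5], 2):
    print(subset)
```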

In contrast, the ranking method discussed in §6.7 lists the subsets according to a different 
ordering, in which all subsets not containing n are listed first, followed by all subsets that do 
contain n, and so on recursively. The latter method produces the following list of subsets: 


{1,2}, {1,3}, {2,3}, {1,4}, {2,4}, {3,4}, {1,5}, {2,5}, {3,5}, {4,5}.




6.14 The Successor Product Rule 


This section develops a version of the Product Rule for successor algorithms. We begin with 
a rule for the Cartesian product of two sets. 


6.35. The Successor Product Rule for Two Sets. Suppose S and T are finite,
nonempty sets, and we know successor algorithms for S and T. The following subroutines
define a successor algorithm for S × T:

\[
\begin{aligned}
\mathrm{first}(S\times T) &= (\mathrm{first}(S),\mathrm{first}(T));\\
\mathrm{last}(S\times T) &= (\mathrm{last}(S),\mathrm{last}(T));\\
\mathrm{next}((a,b),S\times T) &=
\begin{cases}
(a,\mathrm{next}(b,T)) & \text{if } b\neq\mathrm{last}(T);\\
(\mathrm{next}(a,S),\mathrm{first}(T)) & \text{if } b=\mathrm{last}(T).
\end{cases}
\end{aligned}
\]

Intuitively, to find the successor of (a, b), we hold the first coordinate a fixed and replace
b by its successor in T. This works unless b is the last element of T, in which case we
replace a by its successor in S and replace b by the first element of T. As in the case of
ranking algorithms, the successor algorithm for S × T can be derived as a special case of the
Successor Sum Rule. To do so, write $S = \{s_1,\ldots,s_m\}$ (using the ordering determined by
the given successor algorithm for S) and view S × T as the union of the pairwise disjoint sets
$S_i = \{s_i\} \times T$. The pseudocode in Figure 6.4 specializes to the formulas given in Rule 6.35.

Similarly, by iterating the Successor Product Rule for Two Sets, we are led to the
Successor Product Rule for a Cartesian product of k nonempty sets. Combining this rule 
with the Bijection Rule for Successor Algorithms, we obtain the general version of the 
Successor Product Rule given by the pseudocode in Figure 6.7. 


Assumptions: We know a bijection F:S_1 x ... x S_k -> X
and its inverse G:X -> S_1 x ... x S_k; we already know
successor algorithms for each nonempty finite set S_i.

define procedure first(X):
{ for i=1 to k do
  { x_i = first(S_i); }
  return F(x_1,...,x_k);
}

define procedure last(X):
{ for i=1 to k do
  { x_i = last(S_i); }
  return F(x_1,...,x_k);
}

define procedure next(x,X): %% assumes x is not last(X)
{ compute (x_1,...,x_k) = G(x);
  i=k; while (x_i == last(S_i)) do
  { i=i-1; }
  x_i = next(x_i,S_i);
  for j=i+1 to k do
  { x_j = first(S_j); }
  return F(x_1,...,x_k);
}


FIGURE 6.7 
Pseudocode for the Successor Product Rule. 


6.36. Example. If we apply the Successor Product Rule to the product set $[b]^k$, the
resulting successor algorithm implements counting in base b. For example, taking b = 3 and
k = 4, we obtain the following sequence of words:


0000, 0001, 0002, 0010, 0011, 0012, 0020, 0021, 0022, 0100, ..., 2222. 


6.37. Example. Consider license plates consisting of three letters followed by four digits. 
We compute next(WYZ-9999)=WZA-0000 and next(ZZZ-9899)=ZZZ-9900.


6.38. Example. For the set S of four-letter words that begin and end with consonants and 
have a vowel in the second position, we find that next(duzz)=faab and next(satz)=saub. 
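
The three examples above can be reproduced with a short Python sketch of the Successor Product Rule. It assumes each factor set $S_i$ is supplied as a nonempty sequence already listed in its successor order, so first, last, and next on a factor reduce to indexing; the names next_in_product and successor are hypothetical, and the bijection F is taken to be plain concatenation (license plates are written without the hyphen).

```python
import string


def next_in_product(x, factors):
    """Successor of the tuple x in factors[0] x ... x factors[k-1]; None if x is last.

    Each factor is a nonempty sequence (string or list) given in successor order.
    """
    x = list(x)
    # Find the maximal index i whose coordinate is not last in its factor.
    i = len(factors) - 1
    while i >= 0 and x[i] == factors[i][-1]:
        i -= 1
    if i < 0:
        return None
    # Advance coordinate i and reset every later coordinate to its first value.
    x[i] = factors[i][factors[i].index(x[i]) + 1]
    for j in range(i + 1, len(factors)):
        x[j] = factors[j][0]
    return tuple(x)


def successor(word, factors):
    """Apply the rule to a word, using concatenation as the bijection F."""
    result = next_in_product(tuple(word), factors)
    return None if result is None else "".join(result)


letters = string.ascii_lowercase
vowels = "aeiou"
consonants = "".join(c for c in letters if c not in vowels)

base3 = ["012"] * 4
plates = [string.ascii_uppercase] * 3 + [string.digits] * 4
words = [consonants, vowels, letters, consonants]

print(successor("0022", base3))      # 0100
print(successor("WYZ9999", plates))  # WZA0000
print(successor("duzz", words))      # faab
```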
6.15 Successor Algorithms for Set Partitions


In this section, we build successor algorithms for the sets SP(n,k) consisting of all set 
partitions of {1,2,...,n} having k blocks. We do this by applying the Sum Rule, Product 
Rule, and Bijection Rule for Successor Algorithms to the recursion for Stirling numbers of 
the second kind. 

Recall that S(n,k) = |SP(n,k)| satisfies initial conditions S(n,n) = 1 for all $n \geq 0$,
S(n,1) = 1 for all $n \geq 1$, and S(n,k) = 0 when k = 0 < n or k > n. For 1 < k < n, these
numbers satisfy the recursion


S(n,k) = S(n-1,k-1) + kS(n-1,k),


where both terms on the right side are nonzero. The first term counts set partitions in
SP(n,k) where n appears in a block by itself, and the second term counts set partitions
where n appears in a block with other elements. As in §6.10, we number the blocks of a set
partition $\pi$ from 0 to k-1, ordering the blocks by increasing minimum element.

We now describe the first, last, and next subroutines for the sets SP(n,k). Whenever
S(n,k) = 1, the subroutines first and last return the unique set partition in SP(n,k).
This set partition is {{1,2,...,n}} when k = 1, and it is {{1},{2},...,{n}} when k = n.
For 1 < k < n, the first object in SP(n,k) is defined recursively to be the first object in
SP(n-1,k-1) with the block {n} adjoined. The last object in SP(n,k) is defined to be
the last object in SP(n-1,k) with n inserted into the block with the largest minimum
element.

Now suppose $\pi \in SP(n,k)$ is not the last element in this set. We define next($\pi$)
recursively via several cases. Case 1: {n} is one of the blocks of $\pi$; let $\pi'$ be $\pi$ with this block
removed. If $\pi'$ is not last in SP(n-1,k-1), then next($\pi$) is next($\pi'$) with {n} adjoined.
If $\pi'$ is last in SP(n-1,k-1), then next($\pi$) is the first object in SP(n-1,k) with n
inserted into the lowest-numbered block. Case 2: n appears with other elements in the jth
block of $\pi$; let $\pi'$ be $\pi$ with n deleted from its block. If $\pi'$ is not last in SP(n-1,k), then
next($\pi$) is next($\pi'$) with n inserted into the jth block. If $\pi'$ is last in SP(n-1,k), then
next($\pi$) is the first object of SP(n-1,k) with n inserted into block j+1. This completes
the recursive definition of the next subroutine.

We can unravel the recursive calls to obtain more explicit descriptions of the successor
subroutines. We find that the first and last objects in SP(n,k) are the set partitions


first(SP(n,k)) = {{1,2,...,n-k+1}, {n-k+2}, ..., {n-1}, {n}};


last(SP(n,k)) = {{1}, {2}, ..., {k-1}, {k,k+1,...,n-1,n}}.


For $\pi \in SP(n,k)$, we can compute next($\pi$) as follows. Write out all the blocks of $\pi$ and start
erasing the symbols n, n-1, n-2, one at a time, until first encountering a set partition
that is the last object in its class. As each symbol is erased, remember whether it was in
a block by itself or the number of the block it was in. Suppose the final erased symbol is
m, leaving the set partition $\pi'$ that is last in SP(m-1,r). Case 1: If m was in a block by
itself just before being erased, replace $\pi'$ by the first set partition in SP(m-1,r+1), and
insert m in the lowest-numbered block of this new set partition. Case 2: If m was in the jth
block of $\pi'$ just before being erased, replace $\pi'$ by the first set partition in SP(m-1,r),
and insert m in block j+1 of this new set partition. In both cases, continue by restoring
the symbols m+1 through n, one at a time, giving them the same block status they had
originally. For example, if m+1 was originally in a block by itself when it got erased, put
it back into the set partition in a block by itself. If instead m+1 was erased from the sth
block, put it back into the sth block of the current set partition.

Here are some examples of the algorithm. Given $\pi$ = {{1}, {2,8}, {3,4,5,6}, {7,9}}, we
delete 9 from block 3, then delete 8 from block 1, then delete 7 from a block by itself. We
now have the set partition $\pi'$ = {{1}, {2}, {3,4,5,6}}, which is the last object in SP(6,3).
Following Case 1 with m = 7, we replace $\pi'$ with the first object in SP(6,4), namely
{{1,2,3}, {4}, {5}, {6}}. Now we add 7 to block 0, add 8 to block 1, and add 9 to block
3, obtaining the final output next($\pi$) = {{1,2,3,7}, {4,8}, {5}, {6,9}}. To find the next
object after this one, we delete 9 from block 3, 8 from block 1, 7 from block 0, 6 from
a block by itself, 5 from a block by itself, and 4 from a block by itself, finally obtaining
{{1,2,3}}, which is last in its class. Applying Case 1 with m = 4, we pass to the first object
in SP(3,2), namely {{1,2}, {3}}, and put 4 in block 0. Restoring the remaining elements
eventually produces the set partition {{1,2,4,7}, {3,8}, {5}, {6,9}}.

Now suppose $\pi$ = {{1}, {2,7}, {3,4,5,6}, {8,9}}. To compute the next object, delete 9
from block 3, then delete 8 from a block by itself, then delete 7 from block 1, producing the
set partition $\pi'$ = {{1}, {2}, {3,4,5,6}} that is the last object in SP(6,3). Following Case 2
with m = 7, we replace $\pi'$ by the first object in SP(6,3), namely {{1,2,3,4}, {5}, {6}}.
Now we add 7 to block 2, then add 8 in a block by itself, then add 9 to block 3,
obtaining next($\pi$) = {{1,2,3,4}, {5}, {6,7}, {8,9}}. The successor of this object is found
to be {{1,2,3,5}, {4}, {6,7}, {8,9}}. To find the successor of {{1,8}, {2,7}, {3,6}, {4,5}},
remove 8 from block 0, remove 7 from block 1, and remove 6 from block 2, producing
the object {{1}, {2}, {3}, {4,5}} that is last in SP(5,4). The first object in SP(5,4) is
{{1,2}, {3}, {4}, {5}}. Now we add 6 to block 3, add 7 to block 1, and add 8 to block 0,
obtaining the answer {{1,2,8}, {3,7}, {4}, {5,6}}.
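
The erase-and-restore procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a set partition is represented as a list of blocks (each a list of integers), blocks are numbered from 0 by increasing minimum element, and the names first_sp, last_sp, and next_sp are hypothetical. The function returns None when the input is last in SP(n,k).

```python
def first_sp(n, k):
    """First object of SP(n,k): {1,...,n-k+1} followed by singleton blocks."""
    return [list(range(1, n - k + 2))] + [[v] for v in range(n - k + 2, n + 1)]


def last_sp(n, k):
    """Last object of SP(n,k): singletons {1},...,{k-1}, then {k,...,n}."""
    return [[v] for v in range(1, k)] + [list(range(k, n + 1))]


def next_sp(blocks):
    """Successor of a set partition of {1,...,n}, or None if it is last in its class."""
    cur = sorted(sorted(b) for b in blocks)  # order blocks by minimum element
    m = sum(len(b) for b in cur)
    if cur == last_sp(m, len(cur)):
        return None
    erased = []  # (symbol, status) for symbols erased before the final one
    while True:
        # m is the largest remaining symbol, so it is the last entry of its block.
        j = next(idx for idx, b in enumerate(cur) if b[-1] == m)
        if len(cur[j]) == 1:
            status = None  # m was in a block by itself
            cur.pop(j)
        else:
            status = j     # m shared block j with other elements
            cur[j].pop()
        if cur == last_sp(m - 1, len(cur)):
            break
        erased.append((m, status))
        m -= 1
    r = len(cur)
    if status is None:     # Case 1: pass to SP(m-1, r+1), put m in block 0
        cur = first_sp(m - 1, r + 1)
        cur[0].append(m)
    else:                  # Case 2: restart SP(m-1, r), put m in block j+1
        cur = first_sp(m - 1, r)
        cur[status + 1].append(m)
    # Restore m+1, ..., n with the block status each symbol had originally.
    for v, st in reversed(erased):
        if st is None:
            cur.append([v])
        else:
            cur[st].append(v)
    return cur


print(next_sp([[1], [2, 8], [3, 4, 5, 6], [7, 9]]))
# [[1, 2, 3, 7], [4, 8], [5], [6, 9]]
```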




6.16 Successor Algorithms for Dyck Paths 


We now use the First-Return Recursion for Catalan Numbers from §2.10 to develop a
successor algorithm for Dyck paths. Let DP(n) denote the set of Dyck paths of order n;
recall that such a path is a sequence of n north steps and n east steps that never go below
the line y = x. The Catalan numbers $C_n = |DP(n)|$ satisfy the recursion

\[
C_n = C_0C_{n-1} + C_1C_{n-2} + \cdots + C_{k-1}C_{n-k} + \cdots + C_{n-1}C_0 \quad\text{for all } n > 0, \qquad (6.4)
\]

with initial condition $C_0 = 1$. The term $C_{k-1}C_{n-k}$ in this recursion counts Dyck paths of
the form $\pi = N\pi_1E\pi_2$, where $\pi_1$ is a Dyck path of order k-1 (shifted to start at (0,1))
and $\pi_2$ is a Dyck path of order n-k (shifted to start at (k,k)); the index k records where
the full path $\pi$ first returns to the diagonal line y = x.

Applying the Successor Sum Rule and the Successor Product Rule to the recursion
above, we are led to the following successor subroutines. Define first($C_0$) and last($C_0$)
to be the unique path of length zero. For n > 0, recursively define


first($C_n$) = N first($C_0$) E first($C_{n-1}$);   last($C_n$) = N last($C_{n-1}$) E last($C_0$).

Expanding these recursive definitions, we find that

first($C_n$) = NENENE···NE = $(NE)^n$;   last($C_n$) = NN···NEE···E = $N^nE^n$.


To compute next($\pi$) for $\pi \in DP(n)$, first find the first-return factorization $\pi = N\pi_1E\pi_2$
where $\pi_1 \in DP(k-1)$ and $\pi_2 \in DP(n-k)$. We use three cases to define $\pi^*$ =
next($\pi$, DP(n)).

1. If $\pi_2 \neq$ last(DP(n-k)), let $\pi^*$ = N $\pi_1$ E next($\pi_2$, DP(n-k)).

2. If $\pi_2$ = last(DP(n-k)) and $\pi_1 \neq$ last(DP(k-1)), let
   $\pi^*$ = N next($\pi_1$, DP(k-1)) E first(DP(n-k)).

3. If $\pi_2$ = last(DP(n-k)) and $\pi_1$ = last(DP(k-1)), let
   $\pi^*$ = N first(DP(k)) E first(DP(n-k-1)).


6.39. Example. Let us compute next(N NENNEE E NNNEEE, DP(7)). The given input
$\pi$ factors as $\pi = N\pi_1E\pi_2$ where $\pi_1$ = NENNEE and $\pi_2$ = NNNEEE. Here, k = 4, k-1 = 3,
and n-k = 3. Since $\pi_2$ = last(DP(3)) but $\pi_1 \neq$ last(DP(3)), we follow Case 2. We must
now recursively compute $\pi_1'$ = next($\pi_1$, DP(3)). Here, $\pi_1$ factors as $\pi_1 = N\pi_3E\pi_4$, where
$\pi_3$ is the empty word and $\pi_4$ = NNEE. This time, $\pi_4$ is the last Dyck path of order 2 and
$\pi_3$ is the last Dyck path of order 0. Following Case 3, we have


$\pi_1'$ = N first(DP(1)) E first(DP(1)) = N NE E NE.

Using this result in the original calculation, we obtain


next($\pi$) = N $\pi_1'$ E first(DP(3)) = N NNEENE E NENENE.
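
The three cases translate directly into a recursive Python sketch. It assumes Dyck paths are stored as strings over the letters N and E; the helper names first_dp, last_dp, first_return, and next_dp are hypothetical, and next_dp returns None when the input path is last in DP(n).

```python
def first_dp(n):
    """First Dyck path of order n, namely (NE)^n."""
    return "NE" * n


def last_dp(n):
    """Last Dyck path of order n, namely N^n E^n."""
    return "N" * n + "E" * n


def first_return(path):
    """Factor a nonempty Dyck path as N p1 E p2 and return (p1, p2)."""
    height = 0
    for pos, step in enumerate(path):
        height += 1 if step == "N" else -1
        if height == 0:  # first return to the diagonal y = x
            return path[1:pos], path[pos + 1:]
    raise ValueError("not a Dyck path")


def next_dp(path):
    """Successor of a Dyck path in DP(n), or None if the path is last."""
    n = len(path) // 2
    if path == last_dp(n):
        return None
    p1, p2 = first_return(path)
    k = len(p1) // 2 + 1
    if p2 != last_dp(n - k):      # Case 1: advance the second factor
        return "N" + p1 + "E" + next_dp(p2)
    if p1 != last_dp(k - 1):      # Case 2: advance the first factor, reset the second
        return "N" + next_dp(p1) + "E" + first_dp(n - k)
    # Case 3: both factors are last, so move the first return from k to k+1
    return "N" + first_dp(k) + "E" + first_dp(n - k - 1)


print(next_dp("NNENNEEENNNEEE"))  # NNNEENEENENENE, as in Example 6.39
```

Starting from first_dp(3) and applying next_dp repeatedly visits NENENE, NENNEE, NNEENE, NNENEE, NNNEEE in that order.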




Summary 


• Definitions. For each positive integer n, let [n] = {0,1,2,...,n-1}. Given an n-element
set S, a ranking map for S is a bijection $\mathrm{rk}_S : S \to [n]$ given by a specific algorithm. An
unranking map for S is a bijection $\mathrm{unrk}_S : [n] \to S$ given by a specific algorithm. Given
a total ordering $s_0 < s_1 < \cdots < s_{n-1}$ of S, a successor algorithm relative to this ordering
consists of subroutines first, last, and next, where first(S) = $s_0$, last(S) = $s_{n-1}$,
and next($s_i$, S) = $s_{i+1}$ for $0 \leq i < n-1$.


• The Bijective Sum Rule. Suppose a set S is the union of pairwise disjoint finite sets
$S_1,\ldots,S_k$. Given ranking maps $f_i : S_i \to [n_i]$, there is a ranking map
$f : S \to [n_1 + \cdots + n_k]$ given by

\[
f(x) = \sum_{j<i} n_j + f_i(x) \quad\text{for } x \in S_i.
\]

The map f is denoted $\sum_i f_i$ and depends on the given ordering of the sets $S_i$. To
compute $f^{-1}(a)$, find the unique index i such that $\sum_{j<i} n_j \leq a < \sum_{j\leq i} n_j$, and let
$f^{-1}(a) = f_i^{-1}(a - \sum_{j<i} n_j)$. See the pseudocode in Figure 6.1.

• The Bijection Rule for Ranking Maps. If $F : X \to Y$ is a bijection and $\mathrm{rk} : Y \to [n]$ is a
ranking map for Y, then $\mathrm{rk} \circ F$ is a ranking map for X. If $\mathrm{unrk} : [n] \to Y$ is an unranking
map for Y, then $F^{-1} \circ \mathrm{unrk}$ is an unranking map for X.


• Product Maps. Given positive integers $n_1, n_2, \ldots, n_k$, there is a bijection
$p_{n_1,n_2,\ldots,n_k} : [n_1] \times [n_2] \times \cdots \times [n_k] \to [n_1n_2\cdots n_k]$ given by

\[
p_{n_1,n_2,\ldots,n_k}(c_1,c_2,\ldots,c_k) = c_1n_2n_3\cdots n_k + c_2n_3\cdots n_k + \cdots + c_{k-1}n_k + c_k \quad\text{for } 0 \leq c_i < n_i.
\]

The following algorithm computes $p^{-1}_{n_1,\ldots,n_k}(a)$: for i looping from k down to 1, let $r_i$ =
a mod $n_i$, then replace a by a div $n_i$; the output is $(r_1,\ldots,r_k)$. If $n_i = b$ for every i, then
$p^{-1}_{b,\ldots,b}(a)$ is the base b representation of a.


• The Bijective Product Rule. Given sets $S_1,\ldots,S_k$ with $|S_i| = n_i$ and a bijection
$F : S_1 \times \cdots \times S_k \to X$, suppose we know ranking maps for each $S_i$. A ranking map for X is
given by

\[
\mathrm{rk}_X(x) = p_{n_1,\ldots,n_k}(\mathrm{rk}_{S_1}(x_1),\ldots,\mathrm{rk}_{S_k}(x_k)) \quad\text{where } (x_1,\ldots,x_k) = F^{-1}(x).
\]

An unranking map for X is given by

\[
\mathrm{unrk}_X(a) = F(\mathrm{unrk}_{S_1}(c_1),\ldots,\mathrm{unrk}_{S_k}(c_k)) \quad\text{where } (c_1,\ldots,c_k) = p^{-1}_{n_1,\ldots,n_k}(a).
\]

This rule is implemented by the pseudocode in Figure 6.3. For the special case k = 2, see
Figure 6.2.


• Ranking Partial Permutations. The following rules rank k-permutations $w = w_1w_2\cdots w_k$
of an ordered alphabet $A = \{a_0 < a_1 < \cdots < a_{n-1}\}$. To find rk(w), first compute
$(j_1,\ldots,j_k)$ by letting $j_i$ be the number of letters preceding $w_i$ in A that are different
from $w_1,\ldots,w_{i-1}$. Then calculate $\mathrm{rk}(w) = p_{n,n-1,\ldots,n-k+1}(j_1,\ldots,j_k)$. To unrank a, find
$p^{-1}_{n,n-1,\ldots,n-k+1}(a) = (j_1,\ldots,j_k)$, and then recover $w_1,\ldots,w_k$ from left to right by letting
$w_i$ be the $(j_i+1)$th smallest letter in $A - \{w_1,\ldots,w_{i-1}\}$.


• Ranking Subsets. Suppose we are ranking k-element subsets of an ordered alphabet
$A = \{x_0 < x_1 < \cdots < x_{n-1}\}$. If $B = \{x_{i_1} < x_{i_2} < \cdots < x_{i_k}\}$, then $\mathrm{rk}(B) = \sum_{j=1}^{k} \binom{i_j}{j}$. To
unrank a given integer a, recover $i_k,\ldots,i_1$ (in this order) by choosing the largest possible
value that will not cause the partial sum of binomial coefficients $\binom{i_j}{j}$ to exceed a. This
ranking method leads to a listing of k-element subsets in which all subsets containing
$x_{n-1}$ appear after all subsets not containing $x_{n-1}$; each sublist is ordered in the same way
relative to $x_{n-2}$, and so on.


• Ranking Anagrams. The following recursive formula can be used to rank words
$w \in R(a_1^{n_1}\cdots a_k^{n_k})$ in alphabetical order: if $w = a_iw'$ and $n = n_1 + \cdots + n_k$, then

\[
\mathrm{rk}(w) = \sum_{j<i,\; n_j>0} \binom{n-1}{n_1,\ldots,n_j-1,\ldots,n_k} + \mathrm{rk}(w').
\]

To unrank a given integer b, choose i as large as possible so that the sum in the previous
formula does not exceed b; subtract the sum for this choice of i from b; unrank the result
recursively; and prepend the letter $a_i$ to obtain the final answer.


• The Successor Sum Rule. Assume S is the union of nonempty, pairwise disjoint finite
sets $S_1,\ldots,S_k$. Given successor algorithms for each $S_i$, we define first(S) = first($S_1$),
last(S) = last($S_k$), next(x,S) = next(x,$S_i$) if $x \in S_i$ is not last in $S_i$, and next(x,S) =
first($S_{i+1}$) if $x \in S_i$ is last in $S_i$. When some sets $S_i$ may be empty, we modify these
subroutines by skipping over any empty sets; see the pseudocode in Figure 6.4.


• Successor Algorithm for Anagrams. Given the ordered alphabet $\{a_1 < a_2 < \cdots < a_k\}$
and integers $n_i \geq 0$, the first anagram in $R(a_1^{n_1}\cdots a_k^{n_k})$ is $a_1^{n_1}\cdots a_k^{n_k}$; the last anagram
is the reversal of this word. To obtain the next anagram after $w = w_1\cdots w_n$, choose the
maximal i with $w_i < w_{i+1}$; modify $w_i\cdots w_n$ by replacing $w_i$ with the next larger letter
in this suffix, then sorting the remaining letters of the suffix into weakly increasing order.
See the pseudocode in Figure 6.6. This method can also be used to find successors of
permutations and k-element subsets (viewed as binary words).
• The Successor Product Rule. Assume we know successor algorithms for the finite nonempty
sets $S_1,\ldots,S_k$ and a bijection $F : S_1 \times \cdots \times S_k \to X$. Define

first(X) = F(first($S_1$),...,first($S_k$)) and last(X) = F(last($S_1$),...,last($S_k$)).

To compute next(x, X), first compute $(x_1,\ldots,x_k) = F^{-1}(x)$. Find the maximal index i
with $x_i \neq$ last($S_i$), replace $x_i$ by next($x_i$, $S_i$), replace $x_j$ by first($S_j$) for all j > i, and
return the answer $F(x_1,\ldots,x_k)$. See the pseudocode in Figure 6.7.
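
As one concrete illustration of the rules summarized above, the product map $p_{n_1,\ldots,n_k}$ and its inverse can be sketched in Python as follows. The function names product_rank and product_unrank are hypothetical; the inverse follows the mod/div loop described in the Product Maps item.

```python
def product_rank(c, sizes):
    """Compute p_{n_1,...,n_k}(c_1,...,c_k) by Horner's rule, where sizes = (n_1,...,n_k)."""
    a = 0
    for c_i, n_i in zip(c, sizes):
        if not 0 <= c_i < n_i:
            raise ValueError("each c_i must satisfy 0 <= c_i < n_i")
        a = a * n_i + c_i
    return a


def product_unrank(a, sizes):
    """Invert the product map: for i from k down to 1, r_i = a mod n_i and a = a div n_i."""
    r = [0] * len(sizes)
    for i in reversed(range(len(sizes))):
        a, r[i] = divmod(a, sizes[i])
    return tuple(r)


print(product_rank((1, 2, 3), (2, 3, 4)))   # 23
print(product_unrank(23, (2, 3, 4)))        # (1, 2, 3)
print(product_unrank(13, (2, 2, 2, 2)))     # (1, 1, 0, 1), the binary digits of 13
```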


Exercises 


6-1. Suppose $f : \{a,b,c\} \to [3]$ and $g : \{d,e\} \to [2]$ are defined by f(a) = 1, f(b) = 2,
f(c) = 0, g(d) = 1, g(e) = 0. Compute the bijections f + g and g + f.

6-2. (a) Use the maps f and g in the previous exercise and the Bijective Cartesian Product 
Rule to find a bijection from {a,b,c} x {d,e} onto [6]. (b) Use f and g to build a bijection 
from {d,e} x {a,b,c} onto [6]. 

6-3. Let S = {w < z < y < x} and T = {c < a < d}. (a) Describe the first, last, and
next functions constructed by the Successor Sum Rule for the set S ∪ T. (b) Repeat (a)
for the set T ∪ S. (c) Find the successor subroutines constructed by the Successor Product
Rule for the set S × T. (d) Repeat (c) for the set T × S.

6-4. Compute: (a) $p_{7,5}(4,3)$; (b) $p_{7,5}(3,4)$; (c) $p_{5,7}(4,3)$; (d) $p_{5,7}(3,4)$; (e) $p^{-1}_{7,5}(22)$;
(f) $p^{-1}_{5,7}(22)$.

6-5. Find: (a) $p_{2,2,2,2,2}(0,1,1,0,1)$; (b) $p^{-1}_{3,3,2,2,2}(29)$; (c) $p_{7,7,7}(3,0,6)$; (d) $p^{-1}_{7,7,7}(306)$;
(e) $p^{-1}_{10,10,10}(306)$.

6-6. Find: (a) $p_{5,4,3,2,1}(3,3,0,1,0)$; (b) $p^{-1}_{5,4,3,2,1}(111)$; (c) $p_{3,6,2,6}(2,5,0,4)$; (d) $p^{-1}_{3,6,2,6}(150)$;
(e) $p^{-1}_{6,2,6,3}(150)$; (f) $p^{-1}_{6,6,3,2}(150)$.

6-7. Consider the product set X = [3] × [4]. (a) View X as the disjoint union of the sets
$X_i = \{i\} \times [4]$, for i = 0,1,2. Let $f_i : X_i \to [4]$ be the bijection $f_i(i,y) = y$. Compute the
bijections $f_0 + f_1 + f_2$ and $f_2 + f_1 + f_0$, which map X to [12]. (b) View X as the disjoint
union of the sets $X^{(j)} = [3] \times \{j\}$, for j = 0,1,2,3. Let $g_j : X^{(j)} \to [3]$ be the bijection
$g_j(x,j) = x$. Compute the bijection $g_0 + g_1 + g_2 + g_3 : X \to [12]$. (c) Compute the bijection
$p_{3,4} : X \to [12]$. Is this one of the maps found in (a) or (b)? (d) Let $t : X \to [4] \times [3]$ be
the bijection t(i,j) = (j,i). Compute the bijection $p_{4,3} \circ t : X \to [12]$. Is this one of the
maps found in (a) or (b)?

6-8. Rank the following four-letter words: (a) alto; (b) zone; (c) rank; (d) four; (e) word. 
6-9. Unrank the following numbers in $[26^4]$ to obtain four-letter words: (a) 115,287;
(b) 396,588; (c) 392,581; (d) 338,902; (e) 275,497.

6-10. (a) Rank the six-letter word “unrank.” (b) Unrank 199,247,301 to get a 6-letter word. 
(c) What happens if we unrank 199,247,301 to get a k-letter word where k > 6? 

6-11. A fraternity name consists of either two or three capital Greek letters. There are 24 
letters in the Greek alphabet, ordered as follows: 


Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω.


Assume an ordering of fraternity names consisting of all two-letter names in alphabetical
order, followed by all three-letter names in alphabetical order. Compute the rank of: (a) ΦΒΚ;
(b) AA; (c) AAA; (d) ΑΧΩ. Now, unrank: (e) 144; (f) 1440; (g) 13931.
6-12. Repeat (a)-(g) of the previous exercise, assuming the names are ordered so that all
three-letter names precede all two-letter names, with names of each length in alphabetical
order.


6-13. Repeat (a)-(g) of the previous exercise, assuming the names are ordered in alphabetical
order (so that, for example, ΔΔ is immediately preceded by ΔΓΩ and immediately
followed by ΔΔΑ).


6-14. Consider the set of four-digit even numbers (no leading zeroes allowed) that do not 
contain the digit 6. (a) Use the Product Rule to count this set. (b) Find a ranking bijection 
that will list these numbers in increasing numerical order. (c) Use (b) to rank 1234, 2500, 
and 9708. (d) Now unrank 1234, 2501, and 666. 


6-15. Consider five-letter palindromes, ranked in alphabetical order. (a) Rank the palin- 
dromes LEVEL and MADAM. (b) Unrank 1581 and 12,662. (c) Find the first and last 
palindromes in the ranking that are real English words. 


6-16. A Virginia license plate consists of three uppercase letters followed by four digits. For 
arcane bureaucratic reasons, license plate 0 is ZZZ-9999, followed by ZZZ-9998, ..., ZZZ- 
0000, ZZY-9999, etc. Use this system to rank the license plates: (a) ZCF-2073; (b) JXB-2007; 
(c) ABC-1234. Now, unrank: (d) 7,777,777; (e) 123,456,789. 


6-17. Repeat the previous exercise assuming a new ordering, honoring the 400th anniver- 
sary of Jamestown, where license plate 0 is JAM-1607, and license plates count forward in 
lexicographic order (wrapping around from ZZZ-9999 to AAA-0000). 

6-18. Let A = {a,b,c,d,e,f}. (a) Compute the ranks of bfde and fdac among all 4- 
permutations of A. (b) Unrank 232 to get a 4-permutation of A. 

6-19. Let A = {a,b,c,d,e,f,g}. (a) Compute the rank of ecagdb among all permutations of 
A. (b) Unrank 583 to get a permutation of A. 

6-20. (a) Compute the rank of 42153 among all permutations of {1,2,...,5}. (b) Unrank 
46 to obtain a permutation of {1,2,...,5}. 

6-21. (a) Compute the rank of 36281745 among all permutations of {1,2,...,8}. (b) Unrank 
23,419 to obtain a permutation of {1,2,...,8}. 

6-22. Let A = {a,b,c,d,e,f,g,h}. (a) Use the ranking formula for 4-element subsets of A to 
rank the subsets {a,c,e,g}, {b,c,d,h}, and {d,e,f,h}. (b) Unrank 30, 40, and 50 to obtain 
4-element subsets of A. 

6-23. (a) Devise a ranking algorithm for k-element subsets of an n-element alphabet based 
on the recursion C(n, k) = C(n—1, k—1)+C(n—1, k), which differs from the recursion in §6.7 
due to the reversal of the order of terms on the right side. (b) Describe informally the order 
in which the ranking algorithm in (a) will produce the k-element subsets. (c) Answer the 
ranking and unranking questions in the previous exercise using this new ranking algorithm. 
6-24. (a) Find the ranks of bbccacba and cabcabbc in the set $R(a^2b^3c^3)$, ordered
alphabetically. (b) Unrank 206 and 497 to get anagrams in $R(a^2b^3c^3)$.

6-25. (a) Compute the rank of MISSISSIPPI among the set of all anagrams in $R(I^4M^1P^2S^4)$
(listed alphabetically). (b) Which anagram in this set has rank 33,333?

6-26. (a) Use the ranking maps in §6.9 to rank the integer partitions (3, 3,3), (5, 2,2), and 
(4,3, 2,1). (b) Which integer partition in P(12,3) has rank 6? (c) Which integer partition 
in P(15,4) has rank 22? (d) Which integer partition in P(20,6) has rank 47? 

6-27. Use the ranking algorithm from §6.9 to list all integer partitions of 8 into four parts. 


6-28. Follow the method used in Example 6.26 to enumerate all integer partitions of 7. 


6-29. Use the ranking algorithm from §6.10 to list all set partitions of {1,2,3,4,5} with 
three blocks. 
6-30. (a) Use the algorithms in §6.10 to rank the following set partitions relative to the
set SP(n, k): {{1, 3}, {2, 4, 5}}; {{1, 5, 7}, {2}, {3, 4, 8}, {6}}. (b) Unrank 247 to obtain a set 
partition in SP(7,4). (c) Unrank 1492 to obtain a set partition in SP(8, 4). 

6-31. (a) Use the method of §6.11 to find the rank of the rooted tree 


T = {(1,1), (2,1), (3,1), (4,1), (10,1), (7,1), (6,7), (5,6), (8,5), (11,6), (9,11)}.


(b) Unrank 1,609,765 to obtain a rooted tree on nine vertices rooted at vertex 1. 


6-32. Use the algorithms in §6.13 to find the successor of each word in the appropriate set 
of anagrams: (a) ccbabdc; (b) abcddcba; (c) 01101011; (d) 33212312; (e) UKULELE. 


6-33. Find the next ten anagrams following 112021220 in $R(0^21^32^4)$.


6-34. Find the successor of each permutation: (a) 3641275; (b) 6754321; (c) 135798642; 
(d) 123567984. 


6-35. Find the next ten permutations following 416573829. 

6-36. Find the successor of each subset in the set of all 4-element subsets of {a,b,c,d,e,f,g,h}: 
(a) {b,c,d,e}; (b) {a,c,f,h}; (c) {d,e,g,h}; (d) {a,b,c,h}. 

6-37. In the set of all 6-element subsets of {0,1,...,9}, find the next seven subsets following 
{2,3,4,6, 7, 8}. 

6-38. Find the successor of each set partition in SP(8,4): (a) {{1,7}, {2}, {3,5,6}, {4,8}};
(b) {{1,2}, {3,4}, {5,6}, {7,8}}; (c) {{1,4,5}, {6,8}, {2,3}, {7}};
(d) {{1}, {2,3,6,7}, {4,5}, {8}}.

6-39. Find the next ten set partitions following {{1, 4,6}, {2,5,9}, {3,7,8}} in SP(9, 3). 
6-40. Use the first-return recursion (6.4) to create ranking and unranking algorithms for 
Dyck paths. 


6-41. Use the unranking algorithm in the previous exercise to list all Dyck paths of order 
(a) 4; (b) 5. 

6-42. Rank the following Dyck paths. 

a) NNENEENNNEEE 

b) NNNEEENNENEE 

c) NNNENEENNEENNENEEE 

d) NENENENENNNNNEEEEE 


6-43. (a) Unrank 52 to get a Dyck path of order 6. (b) Unrank 335 to get a Dyck path of 
order 7. (c) Unrank 1000 to get a Dyck path of order 8. 

6-44. Find the successor of each Dyck path in DP(6). 

(a) NNENEENNNEEE 

(b) NNNEEENNENEE 

(c) NNENEENENENE 

(d) NENENENNENEE 


6-45. Find the next ten Dyck paths following NENNENNEENEENENNEENE in DP(10). 
6-46. Fix $m, n \in \mathbb{Z}_{>0}$. Write ranking and unranking algorithms for lattice paths from (0,0)
to (mn,n) that never go below the line x = my based on Recursion 2.25.

6-47. Write a successor algorithm for lattice paths from (0,0) to (mn,n) that never go
below the line x = my based on Recursion 2.25.

6-48. Use Recursion 2.22 to create ranking and unranking algorithms for the set of k-element 
multisets using an n-element ordered alphabet. 


6-49. Use the previous exercise to find the rank of each multiset using the alphabet 
{a,b,c,d,e,f}. (a) [b,b,c,d,d,d]; (b) [a,a,d,f,f]; (c) [a,c,d,f]; (d) [c,c,c,c,c]. 
6-50. Unrank the following integers to get 6-element multisets using the alphabet
{a,b,c,d,e}. (a) 132; (b) 31; (c) 207; (d) 99. 

6-51. Find a successor algorithm for the set of k-element multisets using an n-element 
ordered alphabet. 

6-52. Use the previous exercise to find the successor of each multiset using the alphabet 
{1,2,3,4,5,6}: (a) [1,3,4,6]; (b) [5,5,6,6]; (c) [4,4,4,4]; (d) [2,5,5,5].

6-53. Repeat the previous five exercises, but now use ranking and successor algorithms 
based on one of the bijections in §1.12. 

6-54. Write a successor algorithm for the set P(n,k) of integer partitions of n with first 
part k, based on the recursion (6.3). 

6-55. Use the previous exercise to find the successors of the following integer partitions: 
(a) (9,3); (b) (7,4, 2,1); (c) (3,3,1,1,1). 

6-56. (a) Write a successor algorithm for the set P’(n,k) of all integer partitions with 
exactly k parts. (b) Repeat the previous exercise using this successor algorithm. 

6-57. Let CP(n,k) be the set of permutations of {1,2,...,n} with k cycles. Create ranking 
and unranking algorithms for these sets based on Recursion 3.46. 

6-58. Use the previous exercise to rank the following permutations in the sets CP(n, k): 
(a) 35412; (b) 231564798; (c) 23451. 

6-59. Unrank the following integers to obtain permutations in CP(7, 3): (a) 377; (b) 901; 
(c) 1616. 

6-60. Write a successor algorithm for permutations in CP(n,k) based on Recursion 3.46. 

6-61. Use the previous exercise to find the successor of each permutation in CP(n,k): 
(a) 35412; (b) 231564798; (c) 23451. 

6-62. Create ranking and unranking algorithms for derangements based on Recursion 4.17. 

6-63. Use the previous exercise to rank the following derangements: (a) 43512; (b) 3527614; 
(c) 789123654. 

6-64. Unrank the following integers to obtain derangements in D7: (a) 35; (b) 419; (c) 1776. 
6-65. Write a successor algorithm for derangements based on Recursion 4.17. 

6-66. Use the previous exercise to find the successor of each derangement: (a) 25431; 
(b) 7543216; (c) 214365. 

6-67. Write ranking and unranking algorithms for the set SP(n) of all set partitions of an 
n-element set based on Recursion 2.46 for Bell numbers. 

6-68. Use the previous exercise to rank the following set partitions in SP(n). 


(a) {{1, 2, 4}, {3, 5, 6}, {7, 8}} 

(b) {{1, 7}, {2, 4, 5}, {3, 8}, {6}} 

(c) {{1, 8}, {2, 7}, {3, 6}, {4, 5}}

6-69. Unrank the following integers to get set partitions of {1,2,...,8}: (a) 1394; (b) 2758; 
(c) 4026. 

6-70. Write a successor algorithm for the set SP(n) of all set partitions of {1,2,...,n} 
based on Recursion 2.46. 


6-71. Use the previous exercise to find the successor of each set partition. 

(a) {{1, 7}, {2, 4, 5}, {3, 8}, {6}} 

(b) {{1, 8}, {2, 7}, {3, 6}, {4, 5}} 

(c) {{1}, {2, 3, 4}, {5, 6, 7, 8}} 

6-72. Develop ranking and unranking algorithms for 231-avoiding permutations. Find the 
rank of 1524311761089. Unrank 231 to get a 231-avoiding permutation of length 7. 


270 Combinatorics, Second Edition 


6-73. Create ranking and unranking algorithms for the set of subsets of {1,2,...,n} that 
do not contain two consecutive integers. 

6-74. Use the previous exercise to rank the following subsets (take n = 10): (a) {1,3, 9}; 
(b) {2,5, 7,10}; (c) {2, 4,6, 9}. 

6-75. Unrank the following integers to get subsets of {1,2,...,10} with no two consecutive 
integers: (a) 1; (b) 42; (c) 130. 

6-76. Write a successor algorithm for the set of subsets of {1,2,...,n} that do not contain 
two consecutive integers. 

6-77. Use the previous exercise to find the successor of each subset: (a) {1,3,5}; 
(b) {2,5, 7,10}; (c) {4,6,8, 10}. 

6-78. Create algorithms to rank and unrank integer partitions of n into k distinct parts. 
Use your algorithms to rank the partition (10, 7,6,3,1) and unrank 10 to get a partition of 
20 into three distinct parts. 

6-79. We can view a deck of playing cards as the set D = S × V, where S = {♣, ♦, ♥, ♠} is 
the set of suits and V = {A,2,3,4,5,6,7,8,9,10, J,Q, K} is the set of values. (a) Describe 
a ranking algorithm for D. Find the rank of 60, 2%, and K@. (b) Describe an unranking 
algorithm for D. Unrank 47, 13, and 31. (c) Describe a successor algorithm for D. Find the 
successor of 100, KY, and A@. 

6-80. Repeat the previous exercise, now viewing the deck of cards as the set D = V × S. 

6-81. (a) Rank the hand {5&, 7, 8@, 109, JV} among all possible poker hands. (b) Unrank 
1,159,403 to get one of the $\binom{52}{5}$ possible poker hands. In this and later exercises, use the 
ordering of cards in the deck 


A♣ < 2♣ < ... < K♣ < A♦ < ... < A♥ < ... < A♠ < ... < K♠. 


6-82. Create ranking and unranking maps for the set of flush poker hands by applying the 
Bijective Product Rule. (Include straight flushes.) 

6-83. Use the previous exercise to rank these flush poker hands: (a) {89,79, 109, JO, KO}; 
(b) {A@, 2, 40, 5&, 6A}; (c){40, 50,80, Q0, KO}. 

6-84. Unrank these integers to get flush poker hands: (a) 4716; (b) 2724; (c) 295. 

6-85. Create a successor algorithm for the set of flush poker hands. 


6-86. Use the previous exercise to find the successor of each flush poker hand. 

(a) {389, 79,109, JO, KO} 

(b) {Ad 20, 40, 5, 6a} 

(c) {90, 100, JO, Q0, KO}. 

6-87. Create ranking and unranking maps for the set of straight poker hands (including 
flushes). 

6-88. Use the previous exercise to rank these straight poker hands: (a) {3@, 40,50, 6@, 7&}; 
(b) {A@, 29, 30, 4, 59}; (c) {JY, QU, Ke, 100, Ad}. 

6-89. Unrank these integers to get straight poker hands: (a) 1574; (b) 8877; (c) 4900. 

6-90. Create a successor algorithm for the set of straight poker hands. 

6-91. Use the previous exercise to find the successor of each straight poker hand. 

(a) {3&, 40,50, 6@, 7h} 

(b) {Ad 20, 36, 40, 5d} 

(c) {JY, QU, KY, 109, AY} 

6-92. Create ranking and unranking maps for the set of four-of-a-kind poker hands. 

6-93. Use the previous exercise to rank these four-of-a-kind hands: (a) {3@, 8U, 80, 8@, 8&0}: 

6-94. Unrank these integers to get four-of-a-kind hands: (a) 600; (b) 264; (c) 117. 

6-95. Create a successor algorithm for the set of four-of-a-kind poker hands. 

6-96. Use the previous exercise to find the successor of each four-of-a-kind hand. 

(a) {36, 89, 80, 8&, 8h} 

(b) {K@, 40, KO, Kb, KO} 

(c) {K@, 100, 10d, 10%, 10@} 

6-97. Create ranking and unranking maps for the set of full house poker hands. 

6-98. Use the previous exercise to rank these full house hands: (a) {J&, JO, J@, 9%, 99}. 

6-99. Unrank these integers to get full house hands: (a) 515; (b) 3082; (c) 483. 

6-100. Create a successor algorithm for the set of full house poker hands. 

6-101. Use the previous exercise to find the successor of each full house hand. 

(a) {K@, KY, 30, 3, 346} 

(b) {5@,50, KO, K@, KO} 

(c) {Jé, JV, 109, 100, 104} 

6-102. Create ranking and unranking maps for the set of three-of-a-kind poker hands. 

6-103. Use the previous exercise to rank these three-of-a-kind hands: 

(a) {J&, JO, Jb, 9h, 50}. (b) [3&, 39,30, Ad, 20}. (c) {Ad, AY, AY, QU, Kab}. 

6-104. Unrank these integers to get three-of-a-kind hands: (a) 21,751; (b) 8; (c) 50,004. 

6-105. Create a successor algorithm for the set of three-of-a-kind poker hands. 

6-106. Use the previous exercise to find the successor of each three-of-a-kind hand. 

(a) {J&, JO, J&, 9d, 50} 

(b) {KO, KO, K@, 54,690} 

(c) {3@, 39,30, Q@, Ka} 

6-107. Create ranking and unranking maps for the set of two-pair poker hands. 

6-108. Use the previous exercise to rank these two-pair hands: (a) {2@, 24, 9&,90, KO}; 
(b) {A%, AY, Th, QQ, QO}; (c) {KY, Kh, Qh, QQ, Jé}. 

6-109. Unrank these integers to get two-pair hands: (a) 71,031; (b) 99,482; (c) 1417. 

6-110. Create a successor algorithm for the set of two-pair poker hands. 

6-111. Use the previous exercise to find the successor of each two-pair hand. 

(a) {4 49, 80, 8@, Ka} 

(b) {K@, QO, KO, Qh, JO} 

(c) {A@, AD, 9&e, 10, 94} 

6-112. Fix $s,t \in \mathbb{Z}_{>0}$. (a) Use the Bijective Sum Rule to construct a bijection $F$ from 
$[\![s]\!] \times [\![t]\!]$ to $[\![st]\!]$ by writing the domain of $F$ as the disjoint union of sets $S_j = [\![s]\!] \times \{j\}$ for 
$j \in [\![t]\!]$. (b) How is the map $F$ related to the bijection $p_{s,t}$? 

6-113. Fix $s,t \in \mathbb{Z}_{>0}$. Describe the successor algorithm for the product set $[\![s]\!] \times [\![t]\!]$ obtained 
from the Successor Sum Rule by writing this product set as the disjoint union of sets 
$S_j = [\![s]\!] \times \{j\}$ for $j \in [\![t]\!]$. 

6-114. Give careful proofs of the Bijective Sum Rules 6.2 and 6.3. 

6-115. Integer Division Theorem. Prove that for all $a \in \mathbb{Z}$ and all nonzero $b \in \mathbb{Z}$, there 
exist unique $q,r \in \mathbb{Z}$ with $a = bq + r$ and $0 \le r < |b|$. Describe an algorithm for computing 
$q$ and $r$ given $a$ and $b$. 

6-116. Prove that for all $a \in \mathbb{Z}$ and all nonzero $b \in \mathbb{Z}$, there exist unique $q,r \in \mathbb{Z}$ with 
$a = bq + r$ and $-|b|/2 < r \le |b|/2$. Describe an algorithm for computing $q$ and $r$ given $a$ 
and $b$. 


6-117. Prove that the algorithm in Remark 6.10 correctly computes p,_,, (@). 


6-118. Given $a,b \in \mathbb{Z}$ with $b > 0$, recall that $a \bmod b$ is the unique remainder $r \in [\![b]\!]$ such 
that $a = bq + r$ for some $q \in \mathbb{Z}$. (a) Given $s,t > 0$, consider the map $f : [\![st]\!] \to [\![s]\!] \times [\![t]\!]$ 
given by $f(x) = (x \bmod s, x \bmod t)$ for $x \in [\![st]\!]$. Prove that $f$ is a bijection iff $\gcd(s,t) = 1$. 
(b) Generalize (a) to maps from $[\![s_1 s_2 \cdots s_k]\!]$ to $[\![s_1]\!] \times [\![s_2]\!] \times \cdots \times [\![s_k]\!]$. (c) Can you find an 
algorithm for inverting the maps in (a) and (b) when they are bijections? 

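6-119. Fix $k \in \mathbb{Z}_{>0}$. Prove that every $m \in \mathbb{Z}_{\ge 0}$ can be written in exactly one way in the 
form $m = \sum_{j=1}^{k} \binom{i_j}{j}$, where $0 \le i_1 < i_2 < \cdots < i_k$. 

6-120. Fix $k \in \mathbb{Z}_{>0}$. Use the previous exercise to find an explicit formula for a bijection 
$f : \mathbb{Z}_{\ge 0}^k \to \mathbb{Z}_{\ge 0}$. 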


6-121. Suppose we rewrite the recursion for Stirling numbers in the form 


$$S(n,k) = S(n-1,k)\,k + S(n-1,k-1) \quad \text{for all } n,k > 0.$$


(a) Use the Bijective Product Rule and Sum Rule (taking terms in the order written here) 
to devise ranking and unranking algorithms for set partitions in SP(n,k). (b) Rank the 
partition $\pi$ = {{1, 7}, {2, 4, 5}, {3, 8}, {6}} and unrank 111 to obtain an element of SP(7, 3) 
(cf. Example 6.27). (c) Repeat Exercise 6-30 using the new ranking algorithm. 


6-122. (a) Use the pruning bijections in §3.12 to develop ranking and unranking algorithms 
for the set of trees with vertex set $\{v_1,\ldots,v_n\}$ such that $\deg(v_i) = d_i$ for all 
$i$ (where $d_1,\ldots,d_n$ are positive integers summing to $2n-2$). (b) Given $(d_1,\ldots,d_9) = 
(1,2,1,1,1,2,3,1,4)$, find the rank of the tree shown in Figure 3.15. (c) Unrank 129 to 
obtain a tree with the degrees $d_i$ from part (b). 


6-123. Describe a successor algorithm for ranking rooted trees with vertex set {1,2,...,n} 
rooted at vertex 1. Compute the successor of the tree shown in Figure 3.9. 


6-124. Given a totally ordered finite set $S = \{s_0 < s_1 < \cdots < s_{n-1}\}$, a predecessor 
algorithm for $S$ consists of subroutines first, last, and prev, where first$(S) = s_0$, 
last$(S) = s_{n-1}$, and prev$(s_i, S) = s_{i-1}$ for $0 < i \le n-1$. Formulate versions of the 
Bijection Rule, Sum Rule, and Product Rule for creating predecessor algorithms. 


6-125. (a) Write an algorithm for finding the predecessor of a word $w \in \mathcal{R}(a_1^{n_1} \cdots a_k^{n_k})$, 
listing anagrams alphabetically. (b) Find the ten anagrams preceding cbbadcbabcb. 


6-126. (a) Write an algorithm for finding the predecessor of a Dyck path based on the first- 
return recursion (6.4). (b) Find the ten Dyck paths preceding NNNENEENNEENNENEEE. 


6-127. Devise a ranking algorithm for four-letter words in which Q is always followed by 
U (so Q cannot be the last letter). Use your algorithm to rank AQUA and QUIT and to 
unrank 1000. Can you find an algorithm that generates these words in alphabetical order? 
Can you generalize to n-letter words? 

6-128. Devise a ranking algorithm for five-letter words that never have two consecutive 
vowels. Use your algorithm to rank BILBO and THIRD and to unrank 9999. Can you 
find an algorithm that generates these words in alphabetical order? Can you generalize to 
n-letter words? 
Notes 


Our presentation of ranking and unranking maps emphasizes the automatic construction 
of bijections via the Bijective Sum Rule and the Bijective Product Rule. For a somewhat 
different approach based on a multigraph model, see the papers [134, 135]. Other discussions 
of ranking and related problems can be found in the texts [9, 95, 122]. An encyclopedic 
treatment of algorithms for generating combinatorial objects may be found in Knuth’s 
comprehensive treatise [73, §7.2]. The exposition of successor algorithms in this chapter is 
based on [81]. 

Groups, Permutations, and Group Actions 


This chapter contains an introduction to some aspects of group theory that are directly 
related to combinatorial problems. The first part of the chapter defines the initial concepts 
of group theory and derives some fundamental properties of permutations and symmetric 
groups. The second part of the chapter discusses group actions, which have many applica- 
tions to algebra and combinatorics. In particular, group actions can be used to solve counting 
problems in which symmetry must be taken into account. For example, how many ways can 
we color a 5 × 5 chessboard with seven colors if all rotations and reflections of a given colored 
board are considered the same? The theory of group actions provides systematic methods 
for solving problems like this one. 




7.1 Definition and Examples of Groups 


A group is an abstract structure in which any two elements can be combined using an 
operation (analogous to addition or multiplication of numbers) obeying the algebraic axioms 
in the following definition. 


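7.1. Definition: Groups. A group is a set $G$ with a binary operation $\star : G \times G \to G$ 
satisfying these axioms: 


$$\forall x,y,z \in G,\ x \star (y \star z) = (x \star y) \star z \quad \text{(associativity)};$$
$$\exists e \in G,\ \forall x \in G,\ x \star e = x = e \star x \quad \text{(identity)};$$
$$\forall x \in G,\ \exists y \in G,\ x \star y = e = y \star x \quad \text{(inverses)}.$$


The requirement that $\star$ map $G \times G$ into $G$ is often stated explicitly as the following axiom: 
$$\forall x,y \in G,\ x \star y \in G \quad \text{(closure)}.$$
A group $G$ is called Abelian or commutative iff $G$ satisfies this additional axiom: 
$$\forall x,y \in G,\ x \star y = y \star x \quad \text{(commutativity)}.$$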


7.2. Example: Additive Groups. The set $\mathbb{Z}$ of all integers, with addition as the operation, 
is a commutative group. The identity element is $e = 0$, and the (additive) inverse of $x \in \mathbb{Z}$ 
is $-x \in \mathbb{Z}$. Similarly, $\mathbb{Q}$ and $\mathbb{R}$ and $\mathbb{C}$ are all commutative groups under addition. The set 
$\mathbb{Z}_{>0}$ is not a group under addition because there is no identity element in the set $\mathbb{Z}_{>0}$. The 
set $\mathbb{Z}_{\ge 0}$ is not a group under addition because 1 is in $\mathbb{Z}_{\ge 0}$, but 1 has no additive inverse 
in the set $\mathbb{Z}_{\ge 0}$. The three-element set $S = \{-1, 0, 1\}$ is not a group under addition because 
closure fails: $1 \in S$ and $1 \in S$, but $1 + 1 = 2 \notin S$. 


7.3. Example: Multiplicative Groups. The set $\mathbb{Q}_{>0}$ of strictly positive rational numbers 
is a commutative group under multiplication. The identity element is $e = 1$, and the inverse 
of $a/b \in \mathbb{Q}_{>0}$ is $b/a \in \mathbb{Q}_{>0}$. Similarly, the set $\mathbb{R}_{>0}$ of strictly positive real numbers is a 
group under multiplication. The set $\mathbb{Q}$ is not a group under multiplication because 0 has no 
inverse. On the other hand, $\mathbb{Q} - \{0\}$, $\mathbb{R} - \{0\}$, and $\mathbb{C} - \{0\}$ are groups under multiplication. 
So are the two-element set $\{-1, 1\} \subseteq \mathbb{Q}$ and the four-element set $\{1, i, -1, -i\} \subseteq \mathbb{C}$. 


7.4. Example: Symmetric Groups. Let $X$ be any set, and let Sym($X$) be the set of 
all bijections $f : X \to X$. For $f, g \in$ Sym($X$), define $f \circ g : X \to X$ to be the composite 
function that sends $x \in X$ to $f(g(x))$. Then $f \circ g \in$ Sym($X$) since the composition of 
bijections is a bijection, so the closure axiom holds. Given $f, g, h \in$ Sym($X$), note that both 
of the functions $(f \circ g) \circ h : X \to X$ and $f \circ (g \circ h) : X \to X$ send $x \in X$ to $f(g(h(x)))$. So 
these functions are equal, proving the associativity axiom. Next, take $e$ to be the bijection 
$\mathrm{id}_X : X \to X$, which is defined by $\mathrm{id}_X(x) = x$ for all $x \in X$. One immediately checks that 
$f \circ \mathrm{id}_X = f = \mathrm{id}_X \circ f$ for all $f \in$ Sym($X$), so the identity axiom holds. Finally, given a 
bijection $f \in$ Sym($X$), there exists an inverse function $f^{-1} : X \to X$ that is also a bijection, 
and which satisfies $f \circ f^{-1} = \mathrm{id}_X = f^{-1} \circ f$. So the inverse axiom holds. This completes 
the verification that Sym($X$) is a group. This group is called the symmetric group on $X$, 
and elements of Sym($X$) are called permutations of $X$. Symmetric groups play a central 
role in group theory and are closely related to group actions. In the special case when 
$X = \{1, 2, \ldots, n\}$, we write $S_n$ to denote the group Sym($X$). 

Most of the groups Sym($X$) are not commutative. For instance, define $f, g \in S_3$ by 


$$f(1) = 2,\ f(2) = 1,\ f(3) = 3; \qquad g(1) = 3,\ g(2) = 2,\ g(3) = 1.$$


We see that $(f \circ g)(1) = f(g(1)) = 3$, whereas $(g \circ f)(1) = g(f(1)) = 2$. So $f \circ g \ne g \circ f$, 
and the commutativity axiom fails. 
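
The failure of commutativity in $S_3$ can be checked mechanically. The following is a minimal sketch, assuming permutations are stored as Python dictionaries; the helper name `compose` is our own choice for illustration and is not notation from the text.

```python
def compose(f, g):
    """Return the composite f o g, which sends x to f(g(x))."""
    return {x: f[g[x]] for x in g}

# The two elements of S_3 defined in Example 7.4.
f = {1: 2, 2: 1, 3: 3}
g = {1: 3, 2: 2, 3: 1}

fg = compose(f, g)  # f o g
gf = compose(g, f)  # g o f

print(fg[1], gf[1])  # 3 2
print(fg == gf)      # False, so f o g and g o f differ
```

Because `compose(f, g)` applies `g` first and then `f`, it follows the same right-to-left convention as the composite $f \circ g$ used in this chapter.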


7.5. Example: Integers Modulo n. Let $n$ be a fixed positive integer. Let $\mathbb{Z}_n$ be the set 
$[\![n]\!] = \{0, 1, 2, \ldots, n-1\}$. We define a binary operation $\oplus$ on $\mathbb{Z}_n$ by setting, for all $x, y \in \mathbb{Z}_n$, 


$$x \oplus y = \begin{cases} x + y & \text{if } x + y < n; \\ x + y - n & \text{if } x + y \ge n. \end{cases}$$


Closure follows from this definition, once we note that $0 \le x + y \le 2n - 2$ for all $x, y \in \mathbb{Z}_n$. 
The identity element is 0. The inverse of 0 is 0, while for $x > 0$ in $\mathbb{Z}_n$, the inverse of $x$ is 
$n - x \in \mathbb{Z}_n$. Associativity follows from the relations 


$$(x \oplus y) \oplus z = \begin{cases} x + y + z & \text{if } x + y + z < n; \\ x + y + z - n & \text{if } n \le x + y + z < 2n; \\ x + y + z - 2n & \text{if } 2n \le x + y + z < 3n; \end{cases} \;=\; x \oplus (y \oplus z), \tag{7.1}$$


which can be established by a tedious case analysis. Commutativity of $\oplus$ follows from the 
definition and the commutativity of ordinary integer addition. We conclude that $\mathbb{Z}_n$ is an 
additive commutative group containing $n$ elements. In particular, for every positive integer 
$n$, there exists a group of cardinality $n$. 
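
For small $n$, the case analysis behind these claims can be replaced by an exhaustive check. The sketch below is only an illustration under our own naming (`add_mod` and `is_commutative_group` are hypothetical helpers, not part of the text); it tests each axiom over all elements of $\mathbb{Z}_n$ and is not a substitute for the proof.

```python
from itertools import product

def add_mod(x, y, n):
    """The operation of Example 7.5 on {0, 1, ..., n-1}."""
    s = x + y
    return s if s < n else s - n

def is_commutative_group(n):
    """Brute-force check of the group axioms for Z_n under add_mod."""
    elems = range(n)
    pairs = list(product(elems, repeat=2))
    closure = all(0 <= add_mod(x, y, n) < n for x, y in pairs)
    assoc = all(
        add_mod(add_mod(x, y, n), z, n) == add_mod(x, add_mod(y, z, n), n)
        for x, y, z in product(elems, repeat=3)
    )
    identity = all(add_mod(x, 0, n) == x == add_mod(0, x, n) for x in elems)
    # The inverse of 0 is 0; the inverse of x > 0 is n - x.
    inverses = all(add_mod(x, (n - x) % n, n) == 0 for x in elems)
    commutative = all(add_mod(x, y, n) == add_mod(y, x, n) for x, y in pairs)
    return closure and assoc and identity and inverses and commutative

print(all(is_commutative_group(n) for n in range(1, 9)))  # True
```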


7.6. Definition: Multiplication Tables. Given a finite group $G = \{x_1, \ldots, x_n\}$ with 
operation $\star$, a multiplication table for $G$ is an $n \times n$ table, with rows and columns labeled 
by the group elements $x_i \in G$, such that the element in row $i$ and column $j$ is $x_i \star x_j$. When 
the operation is written additively, we refer to this table as the addition table for $G$. It is 
customary, but not mandatory, to take $x_1$ to be the identity element of $G$. 


7.7. Example. The multiplication table for $\{1, i, -1, -i\} \subseteq \mathbb{C}$ and the addition table for 
$\mathbb{Z}_4$ are shown here: 


The reader may notice a relationship between the two tables: each row within the table is 
obtained from the preceding one by a cyclic shift one step to the left. Using terminology 
to be discussed later, this happens because each of the two groups under consideration is 
cyclic of size 4. 

One can define a group operation by specifying its multiplication table. For example, 
here is the table for another group of size 4, which turns out not to be cyclic: 


The identity and inverse axioms can be checked from inspection of the table; we see that 
a is the identity, and every element is equal to its own inverse. There is no quick way to 
verify associativity by visual inspection of the multiplication table, but this axiom can be 
checked exhaustively using the table entries. 

All of the groups in this example are commutative. This can be seen from the multiplication 
tables by noting the symmetry about the main diagonal line: the entry $x_i \star x_j$ in row 
$i$ and column $j$ always equals the entry $x_j \star x_i$ in row $j$ and column $i$. 
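
The observations in Example 7.7 can be reproduced computationally. The following sketch builds the tables for $\{1, i, -1, -i\}$ and $\mathbb{Z}_4$ and tests the cyclic-shift and symmetry properties. The function names are our own, and the third table (bitwise XOR on {0, 1, 2, 3}) is an assumed stand-in for a non-cyclic group of size 4 in which every element is its own inverse; it is not necessarily labeled the same way as the table printed in the text.

```python
def table(elems, op):
    """Row i, column j holds op(elems[i], elems[j])."""
    return [[op(x, y) for y in elems] for x in elems]

def is_symmetric(t):
    """True if the table is symmetric about its main diagonal."""
    n = len(t)
    return all(t[i][j] == t[j][i] for i in range(n) for j in range(n))

def rows_shift_left(t):
    """True if each row is the preceding row shifted cyclically one step left."""
    return all(t[r] == t[r - 1][1:] + t[r - 1][:1] for r in range(1, len(t)))

mult = table([1, 1j, -1, -1j], lambda x, y: x * y)     # {1, i, -1, -i}
add4 = table([0, 1, 2, 3], lambda x, y: (x + y) % 4)   # Z_4
klein = table([0, 1, 2, 3], lambda x, y: x ^ y)        # assumed stand-in

for name, t in [("mult", mult), ("add4", add4), ("klein", klein)]:
    print(name, is_symmetric(t), rows_shift_left(t))
```

All three tables are symmetric, reflecting commutativity, but only the first two have the cyclic-shift pattern.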




7.2 Basic Properties of Groups 


We now collect some facts about groups that follow from the defining axioms. 

First, the identity element $e$ in a group $G$ is unique. For, suppose $e' \in G$ also satisfies 
the identity axiom. On one hand, $e \star e' = e$ since $e'$ is an identity element. On the other 
hand, $e \star e' = e'$ since $e$ is an identity element. So $e = e'$. We use the symbol $e_G$ to denote 
the unique identity element of an abstract group $G$. When the operation is addition or 
multiplication, we write $0_G$ or $1_G$ instead, dropping the $G$ if it is understood from context. 

Similarly, the inverse of an element $x$ in a group $G$ is unique. For suppose $y, y' \in G$ both 
satisfy the condition in the inverse axiom. Using the associativity axiom and the identity 
axiom, 

$$y = y \star e = y \star (x \star y') = (y \star x) \star y' = e \star y' = y'.$$

We denote the unique inverse of $x$ in $G$ by the symbol $x^{-1}$. When the operation is written 
additively, the symbol $-x$ is used. 

A product such as $x \star y$ is often written $xy$, except in the additive case. The associativity 
axiom can be used to show that any parenthesization of a product $x_1 x_2 \cdots x_n$ gives the 
same answer, so it is permissible to omit parentheses in products like these. 


7.8. Theorem: Cancellation Laws and Inverse Rules. For all $a, x, y$ in a group $G$: 


(a) if $ax = ay$ then $x = y$ (left cancellation); (b) if $xa = ya$ then $x = y$ (right cancellation); 


(c) $(x^{-1})^{-1} = x$ (double inverse rule); (d) $(xy)^{-1} = y^{-1}x^{-1}$ (inverse rule for products). 

Proof. For (a), fix $a, x, y \in G$ and assume $ax = ay$; we must prove $x = y$. Multiply both 
sides of $ax = ay$ on the left by $a^{-1}$ to get $a^{-1}(ax) = a^{-1}(ay)$. Then the associativity axiom 
gives $(a^{-1}a)x = (a^{-1}a)y$; the inverse axiom gives $ex = ey$; and the identity axiom gives 
$x = y$. The right cancellation law (b) is proved similarly. For (c), note that 

$$(x^{-1})^{-1} x^{-1} = e = x x^{-1}$$

by the definition of the inverse of $x^{-1}$ and the inverse of $x$. Right cancellation of $x^{-1}$ yields 
$(x^{-1})^{-1} = x$. Similarly, routine calculations using the group axioms show that 


$$(xy)^{-1}(xy) = e = (y^{-1}x^{-1})(xy),$$

so right cancellation of $xy$ gives the inverse rule for products. □ 


7.9. Definition: Exponent Notation. Let $G$ be a group written multiplicatively. Given 
$x \in G$, recursively define $x^0 = 1 = e_G$ and $x^{n+1} = x^n \star x$ for all $n \ge 0$. Define negative 
powers of $x$ by $x^{-n} = (x^{-1})^n$ for all $n > 0$, where the $x^{-1}$ on the right side denotes the 
inverse of $x$ in $G$. 


Informally, for positive $n$, $x^n$ is the product of $n$ copies of $x$. For negative $n$, $x^n$ is the 
product of $|n|$ copies of the inverse of $x$. Note in particular that $x^1 = x$, and the negative 
power $x^{-1}$ (as just defined) does reduce to the inverse of $x$. When $G$ is written additively, 
we write $nx$ instead of $x^n$; this denotes the sum of $n$ copies of $x$ for $n > 0$, or the sum of 
$|n|$ copies of $-x$ for $n < 0$. 


7.10. Theorem: Laws of Exponents. Suppose $G$ is a group, $x \in G$, and $m, n \in \mathbb{Z}$. In 
multiplicative notation, $x^{m+n} = x^m x^n$ and $x^{mn} = (x^n)^m$. If $x, y \in G$ satisfy $xy = yx$, then 
$(xy)^n = x^n y^n$. In additive notation, these results read: $(m+n)x = mx + nx$; $(mn)x = m(nx)$; 
and if $x + y = y + x$, then $n(x+y) = nx + ny$. 


We omit the proof; the main idea is to use induction to establish the results for m,n > 0, 
and then use case analyses to handle the situations where m or n is negative. 
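
As a sanity check on the definition of powers and the laws of exponents, here is a small sketch that follows the recursive definition literally for one sample permutation. The representation (tuples of 0-based images) and the helper names `compose`, `inverse`, and `power` are assumptions made for this illustration only.

```python
def compose(f, g):
    """f o g for permutations stored as tuples of 0-based images."""
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    """The inverse permutation of f."""
    inv = [0] * len(f)
    for i, v in enumerate(f):
        inv[v] = i
    return tuple(inv)

def power(x, n):
    """x^n using x^0 = id, x^(k+1) = x^k o x, and x^(-k) = (x^(-1))^k."""
    if n < 0:
        return power(inverse(x), -n)
    result = tuple(range(len(x)))  # identity permutation
    for _ in range(n):
        result = compose(result, x)
    return result

x = (1, 2, 0, 4, 3)  # a sample permutation: a 3-cycle together with a 2-cycle
laws_hold = all(
    power(x, m + n) == compose(power(x, m), power(x, n))
    and power(x, m * n) == power(power(x, n), m)
    for m in range(-4, 5)
    for n in range(-4, 5)
)
print(laws_hold)  # True
```

The check covers negative exponents as well, which is where the omitted proof needs its case analysis.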




7.3 Notation for Permutations 


Permutations and symmetric groups appear frequently in the theory of groups and group 
actions. So it will be helpful to develop some notation for describing and visualizing per- 
mutations. 


7.11. Definition: Two-Line Form of a Function. Let $X$ be a finite set, and let 
$x_1, \ldots, x_n$ be a list of all the distinct elements of $X$ in some fixed order. The two-line 
form of a function $f : X \to X$ relative to this ordering is the array 


$$f = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ f(x_1) & f(x_2) & \cdots & f(x_n) \end{bmatrix}.$$

If $X = \{1, 2, \ldots, n\}$, we usually display the elements of $X$ on the top line in the order 
$1, 2, \ldots, n$. 


7.12. Example. The notation $f = \begin{bmatrix} a & b & c & d & e \\ b & c & e & a & b \end{bmatrix}$ defines a function on the set $X = 
\{a, b, c, d, e\}$ such that $f(a) = b$, $f(b) = c$, $f(c) = e$, $f(d) = a$, and $f(e) = b$. This function 
is not a permutation, since $b$ occurs twice in the bottom row, and $d$ never occurs. 

The notation $g = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{bmatrix}$ defines an element of $S_5$ such that $g(1) = 2$, 
$g(2) = 4$, $g(3) = 5$, $g(4) = 1$, and $g(5) = 3$. Observe that the inverse of $g$ sends 2 to 1, 4 to 
2, and so on. So, we obtain one possible two-line form of $g^{-1}$ by interchanging the rows in 
the two-line form of $g$: 


$$g^{-1} = \begin{bmatrix} 2 & 4 & 5 & 1 & 3 \\ 1 & 2 & 3 & 4 & 5 \end{bmatrix}.$$


It is customary to write the numbers in the top line in increasing order. This can be 
accomplished by sorting the columns of the previous array: 


$$g^{-1} = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 5 & 2 & 3 \end{bmatrix}.$$

Recall that the group operation in Sym(X) is composition. We can compute the composition 
of two functions written in two-line form by tracing the effect of the composite function on 
each element. For instance, 


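$$\begin{bmatrix} a & b & c & d \\ b & d & a & c \end{bmatrix} \circ \begin{bmatrix} a & b & c & d \\ a & c & d & b \end{bmatrix} = \begin{bmatrix} a & b & c & d \\ b & a & c & d \end{bmatrix},$$

because the left side maps $a$ to $a$ and then to $b$; $b$ maps to $c$ and then to $a$; and so on. 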
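
The manipulations of two-line forms described above translate directly into code. The following is a minimal sketch in which a two-line form is stored as a pair of sequences (top row, bottom row); the function names are hypothetical and chosen only for this illustration.

```python
def is_permutation(top, bottom):
    """A function in two-line form is a permutation iff the bottom row rearranges the top row."""
    return sorted(top) == sorted(bottom)

def inverse_two_line(top, bottom):
    """Interchange the two rows, then sort the columns by the new top row."""
    columns = sorted(zip(bottom, top))
    new_top = [c[0] for c in columns]
    new_bottom = [c[1] for c in columns]
    return new_top, new_bottom

def compose_two_line(top, f_bottom, g_bottom):
    """Bottom row of f o g: trace each x through g first, then through f."""
    f = dict(zip(top, f_bottom))
    g = dict(zip(top, g_bottom))
    return [f[g[x]] for x in top]

print(is_permutation("abcde", "bceab"))                        # False
print(inverse_two_line([1, 2, 3, 4, 5], [2, 4, 5, 1, 3]))      # ([1, 2, 3, 4, 5], [4, 1, 5, 2, 3])
print("".join(compose_two_line("abcd", "bdac", "acdb")))       # bacd
```

The three printed results correspond to the function $f$, the inverse $g^{-1}$, and the composition worked out in Example 7.12.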


If the ordering of X is fixed and known from context, we may omit the top line of the 
two-line form. This leads to one-line notation for a function defined on X. 


7.13. Definition: One-Line Form of a Function. Let $X = \{x_1 < x_2 < \cdots < x_n\}$ 
be a finite totally ordered set. The one-line form of a function $f : X \to X$ is the array 


$$[f(x_1)\ f(x_2)\ \cdots\ f(x_n)].$$


We use square brackets to avoid a conflict with the cycle notation to be introduced 
below. Sometimes we omit the brackets, identifying $f$ with the word $f(x_1) f(x_2) \cdots f(x_n)$. 


7.14. Example. The functions f and g in the preceding example are given in one-line 
form by writing f = [b c e a b] and g = [2 4 5 1 3]. In word notation, f = bceab and 
g = 24513. Note that the one-line form of an element of Sym(X) is literally a permutation 
(rearrangement) of the elements of X, as defined in §1.3. This explains why elements of this 
group are called permutations. 


7.15. Cycle Notation for Permutations. Assume X is a finite set. Recall from §3.6 
that any function f : X — X can be represented by a digraph with vertex set X and a 
directed edge (i, f(i)) for each i in X. An arbitrary digraph on X is the digraph of some 
function f : X — X iff every vertex in X has outdegree 1. In Theorem 3.45 we proved that 
the digraph of a permutation is a disjoint union of directed cycles. For example, Figure 7.1 
displays the digraph of the permutation 


$$h = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 7 & 8 & 4 & 3 & 10 & 2 & 5 & 6 & 9 & 1 \end{bmatrix}.$$


FIGURE 7.1 
Digraph associated to the permutation h. 


We can describe a directed cycle in a digraph by traversing the edges in the cycle and 
listing the elements we encounter in the order of visitation, enclosing the whole list in 
parentheses. For example, the cycle containing 1 in Figure 7.1 can be described by writing 
(1, 7, 5, 10). The cycle containing 9 is denoted by (9). To describe the entire digraph of a 
permutation, we list all the cycles in the digraph, one after the other. For example, h can 
be written in cycle notation as 


h = (1,7,5, 10)(2, 8, 6)(3, 4)(9). 


This cycle notation is not unique. We are free to begin our description of each cycle at any 
vertex in the cycle, and we may also rearrange the order of the cycles. Furthermore, by 
convention it is permissible to omit some or all cycles of length 1. For example, some other 
cycle notations for h are 


h = (5, 10, 1, 7)(3, 4)(9)(6, 2,8) = (2,8, 6)(4,3)(7, 5, 10, 1). 


To compute the inverse of a permutation written in cycle notation, we reverse the orientation 
of each cycle. For example, 


$$h^{-1} = (10, 5, 7, 1)(6, 8, 2)(4, 3)(9).$$
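
The passage from a permutation to its cycle notation, and the rule for inverting by reversing each cycle, can be sketched as follows. The dictionary representation and the names `cycles` and `from_cycles` are assumptions made for illustration; the images of $h$ are read off from the cycle notation given above.

```python
def cycles(perm):
    """Cycle decomposition of a permutation given as a dict x -> perm[x]."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:       # follow the directed edges x -> perm[x]
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        result.append(tuple(cyc))
    return result

def from_cycles(cycs):
    """Rebuild the dict from a list of cycles (every element must be listed)."""
    perm = {}
    for c in cycs:
        for k, x in enumerate(c):
            perm[x] = c[(k + 1) % len(c)]
    return perm

h = dict(zip(range(1, 11), [7, 8, 4, 3, 10, 2, 5, 6, 9, 1]))
print(cycles(h))  # [(1, 7, 5, 10), (2, 8, 6), (3, 4), (9,)]

# Reverse the orientation of each cycle to obtain the inverse.
h_inv = from_cycles([tuple(reversed(c)) for c in cycles(h)])
print(all(h_inv[h[x]] == x for x in h))  # True
```

Starting each cycle at its smallest element is only one of the many valid choices noted above; any rotation of a cycle describes the same permutation.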


7.16. Example. Using word notation, the group $S_3$ consists of these six elements: 
$$S_3 = \{123,\ 213,\ 321,\ 132,\ 231,\ 312\}.$$
Using cycle notation, we can describe the elements of $S_3$ as follows: 


$$S_3 = \{(1)(2)(3),\ (1,2),\ (1,3),\ (2,3),\ (1,2,3),\ (1,3,2)\}.$$


7.17. Example. To compose permutations written in cycle notation, we must see how the 
composite function acts on each element. For instance, consider the product $(3,5)(1,2,4) \circ 
(3,5,2,1)$ in $S_5$. This composite function sends 1 to 3 and then 3 to 5, so 1 maps to 5. Next, 
2 maps first to 1 and then to 2, so 2 maps to 2. Continuing similarly, we find that 


$$(3,5)(1,2,4) \circ (3,5,2,1) = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 3 & 1 & 4 \end{bmatrix} = (1,5,4)(2)(3).$$

With enough practice, one can proceed immediately to the cycle form of the answer without 
writing the two-line form or doing other scratch work. 
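
Until that practice is acquired, the element-by-element tracing can be delegated to a short program. This sketch recomputes the product from Example 7.17; the helper names are our own, and cycles of length 1 may be omitted from the input because every element starts out as a fixed point.

```python
def perm_from_cycles(cycs, n):
    """Permutation of {1, ..., n} as a dict, built from a list of disjoint cycles."""
    perm = {x: x for x in range(1, n + 1)}  # omitted elements are fixed points
    for c in cycs:
        for k, x in enumerate(c):
            perm[x] = c[(k + 1) % len(c)]
    return perm

def compose(f, g):
    """f o g: apply g first, then f."""
    return {x: f[g[x]] for x in g}

left = perm_from_cycles([(3, 5), (1, 2, 4)], 5)
right = perm_from_cycles([(3, 5, 2, 1)], 5)
prod = compose(left, right)

print([prod[x] for x in range(1, 6)])  # [5, 2, 3, 1, 4]
```

The printed list is the bottom row of the two-line form, from which the cycle form (1, 5, 4)(2)(3) can be read off.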


7.18. Definition: k-cycles. For $k > 1$, a $k$-cycle is a permutation $f \in$ Sym($X$) whose 
digraph consists of one cycle of length $k$ and all other cycles of length 1. 


7.19. Remark. We can view the cycle notation for a permutation $f$ as a way of factorizing 
$f$ in the group $S_n$ into a product of cycles. For example, 


$$(1,7,5,10)(2,8,6)(3,4)(9) = (1,7,5,10) \circ (2,8,6) \circ (3,4) \circ (9).$$


Here we have expressed the single permutation on the left side as a product of four other 
permutations in $S_{10}$. The stated equality may be verified by checking that both sides have 
the same effect on each $x \in \{1, 2, \ldots, 10\}$. 


7.20. Definition: cyc(f) and type(f). Given a permutation $f \in$ Sym($X$), let cyc($f$) be 
the number of components (cycles) in the digraph for $f$. Let type($f$) be the list of sizes of 
these components, including repetitions, and written in weakly decreasing order. 


Note that type($f$) is an integer partition of $n = |X|$. 


7.21. Example. The permutation $h$ in Figure 7.1 has cyc($h$) = 4 and type($h$) = (4, 3, 2, 1). 
The identity element of $S_n$, namely id = $(1)(2)\cdots(n)$, has cyc(id) = $n$ and type(id) = 
$(1, \ldots, 1) = (1^n)$. Table 7.1 displays the 24 elements of $S_4$ in cycle notation, collecting 
together all permutations with the same type and counting the number of permutations of 
each type. In Theorem 7.115, we give a general formula for the number of permutations of 
$n$ objects having a given type. 
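
The counts by type for $S_4$ can be recomputed by brute force. The sketch below is an illustration under our own conventions: a permutation is a tuple in one-line form, and `cycle_type` is a hypothetical helper returning type($f$); cyc($f$) is then the length of that tuple.

```python
from collections import Counter
from itertools import permutations

def cycle_type(w):
    """type(w) for a permutation w in one-line form (w[i-1] is the image of i)."""
    n = len(w)
    seen = [False] * n
    sizes = []
    for start in range(n):
        if not seen[start]:
            length, x = 0, start
            while not seen[x]:      # walk around the cycle containing start+1
                seen[x] = True
                x = w[x] - 1
                length += 1
            sizes.append(length)
    return tuple(sorted(sizes, reverse=True))

h = (7, 8, 4, 3, 10, 2, 5, 6, 9, 1)
print(cycle_type(h), len(cycle_type(h)))  # (4, 3, 2, 1) 4

counts = Counter(cycle_type(w) for w in permutations(range(1, 5)))
for t in sorted(counts):
    print(t, counts[t])
```

For $S_4$ this tallies 1, 6, 3, 8, and 6 permutations of types (1,1,1,1), (2,1,1), (2,2), (3,1), and (4), respectively, for a total of 24.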


TABLE 7.1 
Elements of $S_4$ (columns: Type, Permutations, Count). 




7.4 Inversions and Sign of a Permutation 


In this section, we use inversions of permutations to define the sign function sgn : $S_n \to 
\{+1, -1\}$. We then study factorizations of permutations into products of transpositions to 
derive facts about the sgn function. 


7.22. Definition: Inversions and Sign of a Permutation. Let $w = w_1 w_2 \cdots w_n \in S_n$ 
be a permutation written in one-line form. An inversion of $w$ is a pair of positions $(i, j)$ 
such that $i < j$ and $w_i > w_j$. The number of inversions of $w$ is denoted inv($w$). The sign of 
$w$ is defined to be sgn($w$) = $(-1)^{\mathrm{inv}(w)}$. 


7.23. Example. Given $w = 42531$, we have inv($w$) = 7 and sgn($w$) = $-1$. The seven 
inversions of $w$ are (1,2), (1,4), (1,5), (2,5), (3,4), (3,5), and (4,5). For instance, (1,4) is 
an inversion because $w_1 > w_4$ (4 > 3). Table 7.2 displays inv($f$) and sgn($f$) for all $f \in S_3$. 
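
The definition of inversions and sign is easy to test directly. This is a minimal sketch with hypothetical function names; positions are reported 1-based to match the text.

```python
from itertools import permutations

def inversions(w):
    """All pairs of positions (i, j), 1-based, with i < j and w_i > w_j."""
    n = len(w)
    return [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if w[i] > w[j]]

def sgn(w):
    """The sign (-1)^inv(w)."""
    return (-1) ** len(inversions(w))

w = [4, 2, 5, 3, 1]
print(inversions(w))  # [(1, 2), (1, 4), (1, 5), (2, 5), (3, 4), (3, 5), (4, 5)]
print(sgn(w))         # -1

# inv(f) and sgn(f) for every f in S_3, in word notation.
for f in permutations([1, 2, 3]):
    print("".join(map(str, f)), len(inversions(f)), sgn(f))
```

The quadratic-time double loop mirrors the definition; faster methods exist but are not needed for examples of this size.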


Our next goal is to understand how the group operation in $S_n$ (composition of permutations) 
is related to inversions and sign. For this purpose, we introduce special permutations 
called transpositions. 


7.24. Definition: Transpositions. A transposition in $S_n$ is a permutation $f$ of the form 
$(i, j)$, for some $i \ne j$ in $\{1, 2, \ldots, n\}$. A basic transposition in $S_n$ is a transposition $(i, i+1)$, 
for some $i \in \{1, 2, \ldots, n-1\}$. 


TABLE 7.2 
Computing inv($f$) and sgn($f$) for $f \in S_3$. 


Note that the transposition $f = (i, j)$ satisfies $f(i) = j$, $f(j) = i$, and $f(k) = k$ for all 
$k \ne i, j$. The following lemmas illuminate the connection between basic transpositions and 
the process of sorting the one-line form of a permutation into increasing order. 


7.25. Lemma: Basic Transpositions and Sorting. Let $w = w_1 \cdots w_i w_{i+1} \cdots w_n \in S_n$ 
be a permutation in one-line form. For each $i \in \{1, 2, \ldots, n-1\}$, 


$$w \circ (i, i+1) = w_1 \cdots w_{i+1} w_i \cdots w_n.$$


So right-multiplication by the basic transposition $(i, i+1)$ interchanges the elements in 
positions $i$ and $i+1$ of $w$. 


Proof. Let us evaluate the function $f = w \circ (i, i+1)$ at each $k$ between 1 and $n$. When $k = i$, 
$f(i) = w(i+1)$. When $k = i+1$, $f(i+1) = w(i)$. When $k \ne i$ and $k \ne i+1$, $f(k) = w(k)$. 
So the one-line form of $f$ is $w_1 \cdots w_{i+1} w_i \cdots w_n$, as needed. □ 


7.26. Lemma: Basic Transpositions and Inversions. Let $w = w_1 \cdots w_n \in S_n$ be a 
permutation in one-line form. For each $i \in \{1, 2, \ldots, n-1\}$, 


$$\mathrm{inv}(w \circ (i, i+1)) = \begin{cases} \mathrm{inv}(w) + 1 & \text{if } w_i < w_{i+1}; \\ \mathrm{inv}(w) - 1 & \text{if } w_i > w_{i+1}. \end{cases}$$

Consequently, in all cases, we have 
$$\mathrm{sgn}(w \circ (i, i+1)) = -\mathrm{sgn}(w).$$


Proof. We use the result of the previous lemma to compare the inversions of $w$ and $w' = 
w \circ (i, i+1)$. Let $j < k$ be two indices between 1 and $n$, and consider various cases. First, 
if $j \ne i, i+1$ and $k \ne i, i+1$, then $(j, k)$ is an inversion of $w$ iff $(j, k)$ is an inversion of $w'$, 
since $w_j = w'_j$ and $w_k = w'_k$. Second, if $j = i$ and $k > i+1$, then $(i, k)$ is an inversion of $w$ 
iff $(i+1, k)$ is an inversion of $w'$, since $w_i = w'_{i+1}$ and $w_k = w'_k$. Similar results hold in the 
cases $(j = i+1 < k)$, $(j < k = i)$, and $(j < i,\ k = i+1)$. The critical case is when $j = i$ and 
$k = i+1$. If $w_i < w_{i+1}$, then $(j, k)$ is an inversion of $w'$ but not of $w$. If $w_i > w_{i+1}$, then 
$(j, k)$ is an inversion of $w$ but not of $w'$. This establishes the first formula in the lemma. The 
remaining formula follows since sgn($w'$) = $(-1)^{\mathrm{inv}(w) \pm 1} = (-1)^{\mathrm{inv}(w)}(-1) = -\mathrm{sgn}(w)$. □ 
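
The two lemmas above lend themselves to an exhaustive check for small $n$. The sketch below, with function names of our own choosing, uses Lemma 7.25 to realize right-multiplication by $(i, i+1)$ as a swap of adjacent positions, and then tests the inversion formula of Lemma 7.26 on every permutation in $S_n$.

```python
from itertools import permutations

def inv(w):
    """Number of inversions of w in one-line form."""
    n = len(w)
    return sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])

def right_mult_basic(w, i):
    """w o (i, i+1): interchange the entries in positions i and i+1 (1-based)."""
    w = list(w)
    w[i - 1], w[i] = w[i], w[i - 1]
    return tuple(w)

def lemma_holds(n):
    """Check the inversion formula for every w in S_n and every basic transposition."""
    for w in permutations(range(1, n + 1)):
        for i in range(1, n):
            expected = inv(w) + 1 if w[i - 1] < w[i] else inv(w) - 1
            if inv(right_mult_basic(w, i)) != expected:
                return False
    return True

print(lemma_holds(4), lemma_holds(5))  # True True
```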


The next lemma follows immediately from the definitions. 

7.27. Lemma. For all $n \ge 1$, the identity permutation id = $[1\ 2\ \ldots\ n]$ is the unique element 
of $S_n$ satisfying inv(id) = 0. Also, sgn(id) = +1. 


If $f = (i, i+1)$ is a basic transposition, then the ordered pair $(i, i+1)$ is the only inversion 
of $f$, so inv($f$) = 1 and sgn($f$) = $-1$. More generally, we now show that any transposition 
has sign $-1$. 


7.28. Lemma. If $f = (i, j)$ is any transposition in $S_n$, then sgn($f$) = $-1$. 


Proof. Since $(i, j) = (j, i)$, we may assume that $i < j$. Let us write $f$ in two-line form: 


$$f = \begin{bmatrix} 1 & \cdots & i & \cdots & j & \cdots & n \\ 1 & \cdots & j & \cdots & i & \cdots & n \end{bmatrix}.$$


We can find the inversions of $f$ by inspecting the two-line form. The inversions are: all $(i, k)$ 
with $i < k \le j$; and all $(k, j)$ with $i < k < j$. There are $j - i$ inversions of the first type and 
$j - i - 1$ inversions of the second type, hence $2(j - i) - 1$ inversions total. Since this number 
is odd, we conclude that sgn($f$) = $-1$. □ 
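
The inversion count $2(j - i) - 1$ obtained in this proof can be confirmed numerically. In this small sketch, `transposition` and `inv` are hypothetical helpers introduced only for the check.

```python
def inv(w):
    """Number of inversions of w in one-line form."""
    n = len(w)
    return sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])

def transposition(i, j, n):
    """One-line form of the transposition (i, j) in S_n."""
    w = list(range(1, n + 1))
    w[i - 1], w[j - 1] = j, i
    return w

n = 7
count_matches = all(
    inv(transposition(i, j, n)) == 2 * (j - i) - 1
    for i in range(1, n + 1)
    for j in range(i + 1, n + 1)
)
print(count_matches)  # True
```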


7.29. Theorem: Inversions and Sorting. Let w = wiw2---Wn € Sp, be a permutation in 
one-line form. The number inv(w) is the minimum number of steps required to sort the word 
w into increasing order by repeatedly interchanging two adjacent elements. Furthermore, w 
can be factored in S,, into the product of inv(w) basic transpositions. 


Proof. Given w ∈ S_n, it is certainly possible to sort w into increasing order in finitely
many steps by repeatedly swapping adjacent elements. For instance, we can move 1 to the
far left position in at most n − 1 moves, then move 2 to its proper position in at most
n − 2 moves, and so on. Let m be the minimum number of moves of this kind that are
needed to sort w. By Lemma 7.25, we can accomplish each sorting move by starting with
w and repeatedly multiplying on the right by an appropriately chosen basic transposition.
Each such multiplication either increases or decreases the inversion count by 1, according to
Lemma 7.26. At the end, we have transformed w into the identity permutation. Combining
these observations, we see that 0 = inv(id) ≥ inv(w) − m, so that m ≥ inv(w). On the other
hand, consider the following particular sequence of sorting moves starting from w. If the
current permutation w* is not the identity, there exists a smallest index i with w*_i > w*_{i+1}.
Apply the basic transposition (i, i+1), which reduces inv(w*) by 1, and continue. This
sorting method will end in exactly inv(w) steps, since id is the unique permutation with
zero inversions. This proves it is possible to sort w in inv(w) steps, so that m ≤ inv(w).

To prove the last part of the theorem, recall that the sorting process just described can
be implemented by right-multiplying by basic transpositions. We therefore have an equation
in S_n of the form

    w ∘ (i_1, i_1+1) ∘ (i_2, i_2+1) ∘ ··· ∘ (i_m, i_m+1) = id.

Solving for w, and using the fact that (i, j)^{-1} = (j, i) = (i, j), we get

    w = (i_m, i_m+1) ∘ ··· ∘ (i_2, i_2+1) ∘ (i_1, i_1+1),

which expresses w as a product of m basic transpositions. □


7.30. Example. Let us trace through the sorting algorithm in the preceding proof to write 
w = 42531 as a product of inv(w) = 7 basic transpositions. Since 4 > 2, we first multiply w 
on the right by (1,2) to obtain 

    w ∘ (1,2) = 24531.


Observe that inv(24531) = 6 = inv(w) — 1. Next, since 5 > 3, we multiply on the right by 
(3,4) to get 
    w ∘ (1,2) ∘ (3,4) = 24351.




The computation continues as follows: 


    w ∘ (1,2) ∘ (3,4) ∘ (2,3) = 23451;
    w ∘ (1,2) ∘ (3,4) ∘ (2,3) ∘ (4,5) = 23415;
    w ∘ (1,2) ∘ (3,4) ∘ (2,3) ∘ (4,5) ∘ (3,4) = 23145;
    w ∘ (1,2) ∘ (3,4) ∘ (2,3) ∘ (4,5) ∘ (3,4) ∘ (2,3) = 21345;
    w ∘ (1,2) ∘ (3,4) ∘ (2,3) ∘ (4,5) ∘ (3,4) ∘ (2,3) ∘ (1,2) = 12345 = id.


We now solve for w, which has the effect of reversing the order of the basic transpositions 
used to reach the identity: 


    w = (1,2) ∘ (2,3) ∘ (3,4) ∘ (4,5) ∘ (2,3) ∘ (3,4) ∘ (1,2).


It is also possible to find such a factorization by starting with the identity word and unsorting 
to reach w. Here it will not be necessary to reverse the order of the transpositions at the 
end. We illustrate this idea with the following computation: 


    id = 12345;
    id ∘ (3,4) = 12435;
    id ∘ (3,4) ∘ (2,3) = 14235;
    id ∘ (3,4) ∘ (2,3) ∘ (1,2) = 41235;
    id ∘ (3,4) ∘ (2,3) ∘ (1,2) ∘ (2,3) = 42135;
    id ∘ (3,4) ∘ (2,3) ∘ (1,2) ∘ (2,3) ∘ (4,5) = 42153;
    id ∘ (3,4) ∘ (2,3) ∘ (1,2) ∘ (2,3) ∘ (4,5) ∘ (3,4) = 42513;
    id ∘ (3,4) ∘ (2,3) ∘ (1,2) ∘ (2,3) ∘ (4,5) ∘ (3,4) ∘ (4,5) = 42531 = w.


So w = (3,4) ∘ (2,3) ∘ (1,2) ∘ (2,3) ∘ (4,5) ∘ (3,4) ∘ (4,5). Observe that this is a different
factorization of w from the one obtained earlier, although both involve seven basic transpo- 
sitions. This shows that factorizations of permutations into products of basic transpositions 
are not unique. It is also possible to find factorizations involving more than seven factors, 
by interchanging two entries that are already in the correct order during the sorting of w 
into id. So the number of factors in such factorizations is not unique either; but we will see 
shortly that the parity of the number of factors (odd or even) is uniquely determined by w. 
In fact, the parity is odd when sgn(w) = —1 and even when sgn(w) = +1. 
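
The sorting procedure from the proof of Theorem 7.29 can be sketched in Python as follows. This code is an illustration added here, not the author's; it assumes words are stored as tuples in one-line form, and the function names are hypothetical. It uses the fact that right-multiplying a word by (i, i+1) swaps the entries in positions i and i+1.

```python
def inv(w):
    """Number of inversions of a permutation w given in one-line form."""
    n = len(w)
    return sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])


def sorting_steps(w):
    """Sort w by always swapping at the smallest descent position.

    Returns [i_1, i_2, ..., i_m], meaning that the basic transpositions
    (i_1, i_1+1), (i_2, i_2+1), ... were applied on the right, in this order.
    """
    word = list(w)
    steps = []
    while True:
        # 0-based positions p with word[p] > word[p + 1]
        descents = [p for p in range(len(word) - 1) if word[p] > word[p + 1]]
        if not descents:
            return steps
        p = descents[0]
        word[p], word[p + 1] = word[p + 1], word[p]
        steps.append(p + 1)  # record the 1-based index i


def product_of_basic_transpositions(n, indices):
    """One-line form of (i_1, i_1+1) o (i_2, i_2+1) o ... for indices [i_1, i_2, ...].

    We start from the identity word and right-multiply by each factor in turn,
    which swaps the entries in positions i and i+1.
    """
    word = list(range(1, n + 1))
    for i in indices:
        word[i - 1], word[i] = word[i], word[i - 1]
    return tuple(word)


w = (4, 2, 5, 3, 1)
steps = sorting_steps(w)               # indices used while sorting w to id
factorization = list(reversed(steps))  # solving for w reverses the order
```

Here steps records the same seven swaps as the first computation in Example 7.30, and feeding either factorization from that example to product_of_basic_transpositions recovers the word 42531.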


We now have enough machinery to prove the fundamental properties of the sign function. 


7.31. Theorem: Properties of Sign. (a) For all f, g ∈ S_n, sgn(f ∘ g) = sgn(f) · sgn(g).
(b) For all f ∈ S_n, sgn(f^{-1}) = sgn(f).


Proof. (a) If g = id, then the result is true since f ∘ g = f and sgn(g) = 1 in this case. If
t = (i, i+1) is a basic transposition, then Lemma 7.26 shows that sgn(f ∘ t) = −sgn(f).
Given a non-identity permutation g, use Theorem 7.29 to write g as a nonempty product
of basic transpositions, say g = t_1 ∘ t_2 ∘ ··· ∘ t_k. Then, for every f ∈ S_n, repeated use of
Lemma 7.26 gives

    sgn(f ∘ g) = sgn(f t_1 ··· t_{k−1} t_k) = −sgn(f t_1 ··· t_{k−1})
               = (−1)^2 sgn(f t_1 ··· t_{k−2}) = ··· = sgn(f)(−1)^k.

In particular, this equation is true when f = id; in that case, we obtain sgn(g) = (−1)^k.
Using this fact in the preceding equation (for arbitrary f ∈ S_n) produces sgn(f ∘ g) =
sgn(f) sgn(g).

(b) By part (a), sgn(f) · sgn(f^{-1}) = sgn(f ∘ f^{-1}) = sgn(id) = +1. If sgn(f) = +1, it
follows that sgn(f^{-1}) = +1. If instead sgn(f) = −1, then it follows that sgn(f^{-1}) = −1. So
sgn(f^{-1}) = sgn(f) in both cases. □


Iteration of Theorem 7.31 shows that

    sgn(f_1 ∘ ··· ∘ f_k) = ∏_{i=1}^{k} sgn(f_i).        (7.2)


7.32. Theorem: Factorizations into Transpositions. Let f = t_1 ∘ t_2 ∘ ··· ∘ t_k be any
factorization of f ∈ S_n into a product of transpositions (not necessarily basic ones). Then
sgn(f) = (−1)^k. In particular, the parity of k (odd or even) is uniquely determined by f.


Proof. By Lemma 7.28, sgn(t_i) = −1 for all i. The conclusion now follows by setting f_i = t_i
in (7.2). □


7.33. Theorem: Sign of a k-cycle. The sign of any k-cycle (i_1, i_2, ..., i_k) is (−1)^{k−1}.


Proof. The result is already known for k = 1 and k = 2. For k > 2, one may check that the
given k-cycle can be written as the following product of k − 1 transpositions:

    (i_1, i_2, ..., i_k) = (i_1, i_2) ∘ (i_2, i_3) ∘ (i_3, i_4) ∘ ··· ∘ (i_{k−1}, i_k).

So the result follows from Theorem 7.32. □


We now show that the sign of a permutation f can be computed from type(f) or cyc(f).


7.34. Theorem: Cycle Type and Sign. Suppose f ∈ S_n has type(f) = μ. Then

    sgn(f) = ∏_{i=1}^{ℓ(μ)} (−1)^{μ_i − 1} = (−1)^{n − ℓ(μ)} = (−1)^{n − cyc(f)}.

Proof. Let the cycle decomposition of f be f = C_1 ∘ ··· ∘ C_{ℓ(μ)}, where C_i is a μ_i-cycle. The
result follows from the relations sgn(f) = ∏_{i=1}^{ℓ(μ)} sgn(C_i) and sgn(C_i) = (−1)^{μ_i − 1}. □


7.35. Example. The permutation f = (4,6,2,8)(3,9,1)(5,10,7) in S_10 has type(f) =
(4, 3, 3), cyc(f) = 3, and sgn(f) = (−1)^{10−3} = −1.
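
As a cross-check of Theorem 7.34, the sketch below computes the sign of a permutation in two ways: from the number of cycles and from the inversion count. It is an illustration under the assumption that permutations are tuples in one-line form; the helper names (from_cycles, cycle_type, and so on) are introduced here and do not come from the text.

```python
def from_cycles(n, cycles):
    """One-line form of the permutation of {1, ..., n} with the given disjoint cycles."""
    f = list(range(1, n + 1))
    for c in cycles:
        for k, a in enumerate(c):
            f[a - 1] = c[(k + 1) % len(c)]  # a maps to the next entry of its cycle
    return tuple(f)


def cycle_type(f):
    """Cycle lengths of f (including fixed points), in weakly decreasing order."""
    seen = [False] * len(f)
    lengths = []
    for start in range(len(f)):
        if not seen[start]:
            length, x = 0, start
            while not seen[x]:
                seen[x] = True
                x = f[x] - 1
                length += 1
            lengths.append(length)
    return sorted(lengths, reverse=True)


def sgn_from_cycles(f):
    """sgn(f) = (-1)^(n - cyc(f)), as in Theorem 7.34."""
    return (-1) ** (len(f) - len(cycle_type(f)))


def sgn_from_inversions(f):
    """sgn(f) = (-1)^inv(f), the definition of the sign."""
    n = len(f)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if f[a] > f[b])
    return (-1) ** inversions


f = from_cycles(10, [(4, 6, 2, 8), (3, 9, 1), (5, 10, 7)])
```

For the permutation f of Example 7.35, both functions should return −1.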




7.5 Subgroups 


Suppose G is a group with operation * : G × G → G, and H is a subset of G. One might
wonder if H is also a group with operation obtained by restricting * to the domain H x H. 
This construction succeeds when H is a subgroup, as defined below. 


7.36. Definition: Subgroups. Let G be a group with operation *, and let H be a subset
of G. H is called a subgroup of G, written H ≤ G, iff H satisfies the following three closure
conditions:

    e_G ∈ H                        (closure under identity);
    ∀a, b ∈ H, a * b ∈ H           (closure under the operation);
    ∀a ∈ H, a^{-1} ∈ H             (closure under inverses).

A subgroup H is called normal in G, written H ⊴ G, iff

    ∀g ∈ G, ∀h ∈ H, g h g^{-1} ∈ H (closure under conjugation).


Let us verify that when H is a subgroup of G, H really is a group using the restriction *'
of * to H × H. Since H is closed under the operation, *' does map H × H into the codomain
H (not just into G), so the closure axiom holds for H. Since H is a subset of G, associativity
holds in H because it is known to hold in G. The identity e of G lies in H by assumption.
Since e * g = g = g * e holds for all g ∈ G, the relation e *' h = h = h *' e certainly holds for
all h ∈ H ⊆ G. Finally, every element x of H has an inverse y (relative to *) that lies in H,
by assumption. Now y is still an inverse of x relative to *', so the inverse axiom holds. This
completes the proof of the group axioms for H. It can be checked that if G is commutative
then H is commutative, but the converse statement is not always true. Henceforth, we use
the same symbol * (instead of *') to denote the operation in the subgroup H.


7.37. Example. We have the following chain of subgroups of the additive group C:

    {0} ≤ Z ≤ Q ≤ R ≤ C.

Likewise, {−1, 1} and Q_{>0} are both subgroups of Q − {0} under multiplication. The set
{0, 3, 6, 9} is a subgroup of the additive group Z_12; one can prove closure under addition
and inverses by a finite case analysis, or by inspection of the relevant portion of the addition
table for Z_12. On the other hand, {0, 3, 6, 9} is not a subgroup of Z, since this set is not
closed under addition or inverses.


7.38. Example. The sets H = {(1)(2)(3), (1,2,3), (1,3,2)} and K = {(1)(2)(3), (1,3)} are
subgroups of S_3, as one readily verifies. Moreover, H is normal in S_3, but K is not. The set
J = {(1)(2)(3), (1,2), (2,3), (1,3)} is not a subgroup of S_3, since closure under the operation
fails: (1,3) and (2,3) are in J, but (1,3) ∘ (2,3) = (1,3,2) is not in J. Here is a four-element
normal subgroup of S_4:

    V = {(1)(2)(3)(4), (1,2)(3,4), (1,3)(2,4), (1,4)(2,3)}.


Each element of V is its own inverse, and one confirms closure of V under the operation 
by checking all possible products. To prove the normality of V in S4, it is helpful to use 
Theorem 7.112 later in this chapter. 
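
The closure conditions of Definition 7.36 can be checked by brute force for small symmetric groups. The Python sketch below is an added illustration, not the method of Theorem 7.112. It assumes permutations are tuples in one-line form, so (1,2,3) becomes (2, 3, 1) and (1,3) becomes (3, 2, 1). The set J is taken here to be the identity together with the three transpositions of S_3, which is an assumption about the intended set.

```python
from itertools import permutations


def compose(f, g):
    """(f o g)(x) = f(g(x)) for permutations in one-line form with values 1..n."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def inverse(f):
    """Inverse permutation of f."""
    result = [0] * len(f)
    for i, v in enumerate(f):
        result[v - 1] = i + 1
    return tuple(result)


def is_subgroup(H, n):
    """Check the three closure conditions of Definition 7.36 inside S_n."""
    H = set(H)
    identity = tuple(range(1, n + 1))
    return (
        identity in H
        and all(compose(a, b) in H for a in H for b in H)
        and all(inverse(a) in H for a in H)
    )


def is_normal(H, n):
    """Check that H is a subgroup of S_n closed under conjugation by all of S_n."""
    H = set(H)
    return is_subgroup(H, n) and all(
        compose(compose(g, h), inverse(g)) in H
        for g in permutations(range(1, n + 1))
        for h in H
    )


H = {(1, 2, 3), (2, 3, 1), (3, 1, 2)}             # identity, (1,2,3), (1,3,2)
K = {(1, 2, 3), (3, 2, 1)}                        # identity, (1,3)
J = {(1, 2, 3), (2, 1, 3), (1, 3, 2), (3, 2, 1)}  # identity and three transpositions
V = {(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)}
```

With these definitions, is_normal(H, 3) and is_normal(V, 4) should hold, K should be a subgroup that is not normal, and J should fail is_subgroup.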


7.39. Example. The set of even integers is a subgroup of the additive group Z. This follows 
because the identity element zero is even; the sum of two even integers is even; and if x is 
even then −x is even. More generally, let k be any fixed integer, and let H = {kn : n ∈ Z}
consist of all integer multiples of k. A routine verification shows that H is a subgroup of Z. 
We write H = kZ for brevity. The next theorem shows that we have found all the subgroups 
of Z. 


7.40. Theorem: Subgroups of Z. Every subgroup H of the additive group Z has the
form kZ for a unique integer k ≥ 0.


Proof. We have noted that all the subsets kZ are indeed subgroups. Given an arbitrary
subgroup H, consider two cases. If H = {0}, then H = 0Z. Otherwise, H contains at least
one nonzero integer m. If m is negative, then −m ∈ H since H is closed under inverses. So,
H contains strictly positive integers. Take k to be the least positive integer in H. We claim
that H = kZ. Let us prove that kn ∈ H for all n ∈ Z, so that kZ ⊆ H. For n ≥ 0, we use
induction on n. When n = 0, we must prove k·0 = 0 ∈ H, which holds since H contains
the identity of Z. When n = 1, we must prove k·1 = k ∈ H, which is true by choice of
k. Assume n ≥ 1 and kn ∈ H. Then k(n+1) = kn + k ∈ H since kn ∈ H, k ∈ H, and
H is closed under addition. Finally, for negative n, write n = −m for some m > 0. Then
kn = −(km) ∈ H since km ∈ H and H is closed under inverses.

The key step is to prove the reverse inclusion H ⊆ kZ. Fix z ∈ H. Dividing z by k,
we obtain z = kq + r for some integers q, r with 0 ≤ r < k. By what we proved in the
last paragraph, k(−q) ∈ H. So, r = z − kq = z + k(−q) ∈ H since H is closed under
addition. Now, since k is the least positive integer in H, we cannot have 0 < r < k. The
only possibility left is r = 0, so z = kq ∈ kZ, as needed.

To prove that k is uniquely determined by H, suppose kZ = mZ for k, m ≥ 0. Now
k = 0 iff m = 0, so we may assume k, m > 0. Since k ∈ kZ = mZ, k is a multiple of m.
Similarly, m is a multiple of k. As both k and m are positive, this forces k = m, completing
the proof. □


How can we find subgroups of a given group G? As we see next, each element x ∈ G can
be used to produce a subgroup of G.


7.41. Definition: Cyclic Subgroups and Cyclic Groups. Let G be a multiplicative
group, and let x ∈ G. The cyclic subgroup of G generated by x is ⟨x⟩ = {x^n : n ∈ Z}. G
is called a cyclic group iff there exists x ∈ G with G = ⟨x⟩. When G is an additive group,
⟨x⟩ = {nx : n ∈ Z}.


One checks, using the Laws of Exponents, that the subset ⟨x⟩ of G really is a subgroup.


7.42. Example. The additive group Z is cyclic, since Z = ⟨1⟩ = ⟨−1⟩. The subgroups
kZ = ⟨k⟩ considered above are cyclic subgroups of Z. Theorem 7.40 shows that every
subgroup of Z is cyclic. The additive groups Z_n are also cyclic; each of these groups (for
n > 1) is generated by 1. The group {a, b, c, d} with multiplication table given in Example 7.7
is not cyclic. To prove this, we compute all the cyclic subgroups of this group:

    ⟨a⟩ = {a},   ⟨b⟩ = {a, b},   ⟨c⟩ = {a, c},   ⟨d⟩ = {a, d}.

None of the cyclic subgroups equals the whole group, so the group is not cyclic. For a bigger
example of a non-cyclic group, consider Q under addition. Any nonzero cyclic subgroup has
the form ⟨a/b⟩ for some positive rational number a/b. One may check that a/(2b) does not
lie in this subgroup, so Q ≠ ⟨a/b⟩. Non-commutative groups furnish additional examples of
non-cyclic groups, as the next result shows.


7.43. Theorem. Every cyclic group is commutative. 


Proof. Let G = ⟨x⟩ be cyclic. Given y, z ∈ G, we can write y = x^n and z = x^m for some
n, m ∈ Z. Since integer addition is commutative, the Laws of Exponents give

    yz = x^n x^m = x^{n+m} = x^{m+n} = x^m x^n = zy. □


By adapting the proof of Theorem 7.40, one can show that every subgroup of a cyclic 
group is cyclic. 


7.44. Example. The cyclic group Z_6 has the following cyclic subgroups (which are all the
subgroups of this group):

    ⟨0⟩ = {0};   ⟨1⟩ = {0, 1, 2, 3, 4, 5} = ⟨5⟩;   ⟨2⟩ = {0, 2, 4} = ⟨4⟩;   ⟨3⟩ = {0, 3}.

In the group S_4, (1,3,4,2) generates the cyclic subgroup

    ⟨(1,3,4,2)⟩ = {(1,3,4,2), (1,4)(3,2), (1,2,4,3), (1)(2)(3)(4)}.
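
The list of cyclic subgroups of Z_6 can be reproduced with a few lines of Python. This is a small sketch added for illustration; the function names are hypothetical, and elements of Z_n are represented by the integers 0, 1, ..., n − 1.

```python
def cyclic_subgroup(x, n):
    """The cyclic subgroup <x> = {kx mod n : k in Z} of the additive group Z_n."""
    elements = {0}
    y = x % n
    while y != 0:          # keep adding x until we return to the identity 0
        elements.add(y)
        y = (y + x) % n
    return frozenset(elements)


def all_cyclic_subgroups(n):
    """The set of distinct cyclic subgroups of Z_n."""
    return {cyclic_subgroup(x, n) for x in range(n)}
```

For n = 6 this produces exactly four distinct subgroups, matching the list above.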


7.6 Automorphism Groups of Graphs


This section uses graphs to construct examples of subgroups of symmetric groups. These 
subgroups are needed later when we discuss applications of group theory to counting prob- 
lems. 


7.45. Definition: Automorphism Group of a Graph. Let K be a simple graph with
vertex set X and edge set E. A graph automorphism of K is a bijection f : X → X such
that, for all u, v in X, {u, v} ∈ E iff {f(u), f(v)} ∈ E. Let Aut(K) denote the set of all
graph automorphisms of K. We make an analogous definition for directed simple graphs;
here, the bijection f must satisfy (u, v) ∈ E iff (f(u), f(v)) ∈ E for all u, v ∈ X.


Using this definition, it can be checked that the automorphism group Aut(K) is a subgroup
of the symmetric group Sym(X) under the operation ∘ (function composition). Informally,
we can visualize a bijection f : X → X as a relabeling of the vertices that replaces
each vertex label u ∈ X by a new label f(u). Such an f is an automorphism of K iff the
relabeled graph has exactly the same edges as the original graph.


FIGURE 7.2
Graphs used to illustrate automorphism groups: the undirected cycle C_5 and the directed cycle D_5.


7.46. Example. Consider the graphs shown in Figure 7.2. The undirected cycle C_5 has
exactly ten automorphisms. They are given in one-line form in the following list:


    [12345], [23451], [34512], [45123], [51234],
    [54321], [43215], [32154], [21543], [15432].


The same automorphisms, written in cycle notation, look like this: 


    (1)(2)(3)(4)(5),  (1,2,3,4,5),     (1,3,5,2,4),     (1,4,2,5,3),     (1,5,4,3,2),
    (1,5)(2,4)(3),    (1,4)(2,3)(5),   (1,3)(4,5)(2),   (1,2)(3,5)(4),   (2,5)(3,4)(1).


Geometrically, we can think of C_5 as a necklace with five beads. The first five automorphisms
on each list can be implemented by rotating the necklace through various angles (rotation 
by zero is the identity map). The next five automorphisms arise by reflecting the necklace 
in five possible axes of symmetry. 

Now consider the automorphism group of the directed cycle D_5. Every automorphism of
the directed graph D_5 is automatically an automorphism of the associated undirected graph
C_5, so Aut(D_5) ≤ Aut(C_5). However, not every automorphism of C_5 is an automorphism
of D_5. In this example, the five rotations preserve the direction of the edges, hence are
automorphisms of D_5. But the five reflections reverse the direction of the edges, so these are
not elements of Aut(D_5). We can write Aut(D_5) = ⟨(1,2,3,4,5)⟩, so that this automorphism
group is cyclic of size 5.

The 6-cycle C_6 can be analyzed in a similar way. The automorphism group consists of
six rotations and six reflections, which are given in cycle form below:

    (1)(2)(3)(4)(5)(6),   (1,2,3,4,5,6),    (1,3,5)(2,4,6),       (1,4)(2,5)(3,6),
    (1,5,3)(2,6,4),       (1,6,5,4,3,2),    (2,6)(3,5)(1)(4),     (1,2)(3,6)(4,5),
    (1,3)(4,6)(2)(5),     (2,3)(5,6)(1,4),  (1,5)(2,4)(3)(6),     (1,6)(2,5)(3,4).


The observations in the previous example generalize as follows. 


7.47. Theorem: Automorphism Group of a Cycle. For n ≥ 3, let C_n be the
graph with vertex set X = {1, 2, ..., n} and edges {1,2}, {2,3}, ..., {n−1,n}, {n,1}. Then
Aut(C_n) is a subgroup of S_n of size 2n, called the dihedral group of order 2n. The elements
of this group (in one-line form) are the n permutations

    [i, i+1, i+2, ..., n, 1, 2, ..., i−1]   for 1 ≤ i ≤ n,        (7.3)

together with the reversals of these n words.


Proof. It is routine to check that the 2n permutations mentioned in the theorem preserve
the edges of C_n and hence are automorphisms of this graph. We must show that these are
the only automorphisms of C_n. Let g be any automorphism of C_n, and let i = g(1). Now,
since 1 and 2 are adjacent in C_n, g(1) and g(2) must also be adjacent in C_n. There are two
cases: g(2) = i + 1 or g(2) = i − 1 (we read all outputs mod n, interpreting n + 1 as 1 and 1 − 1
as n). Suppose the first case occurs. Since 2 is adjacent to 3, we must have g(3) = i + 2 or
g(3) = i. But i = g(1) and g is injective, so it must be that g(3) = i + 2. Continuing around
the cycle in this way, we see that g must be one of the permutations displayed in (7.3).
Similarly, in the case where g(2) = i − 1, we are forced to have g(3) = i − 2, g(4) = i − 3,
and so on, so that g must be the reversal of one of the permutations in (7.3). □
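
For small n, the conclusion of Theorem 7.47 can be confirmed by exhaustive search. The following Python sketch is an added illustration: it tests every bijection of the vertex set, which is feasible only for small graphs, and the function names are chosen here for convenience.

```python
from itertools import permutations


def cycle_graph_edges(n):
    """Edge set of C_n: {1,2}, {2,3}, ..., {n-1,n}, {n,1}."""
    return {frozenset((i, i % n + 1)) for i in range(1, n + 1)}


def automorphisms(n, edges):
    """All automorphisms of a graph on {1, ..., n}, in one-line form (brute force)."""
    result = []
    for word in permutations(range(1, n + 1)):
        # word[u - 1] is the image of vertex u
        image = {frozenset((word[u - 1], word[v - 1])) for u, v in map(tuple, edges)}
        if image == edges:  # for a bijection this is equivalent to the "iff" condition
            result.append(word)
    return result


def dihedral_words(n):
    """The 2n words of Theorem 7.47: the words in (7.3) and their reversals."""
    rotations = [tuple(range(i, n + 1)) + tuple(range(1, i)) for i in range(1, n + 1)]
    return set(rotations) | {r[::-1] for r in rotations}
```

For each small n, the set returned by automorphisms(n, cycle_graph_edges(n)) should coincide with dihedral_words(n).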


The reasoning used in the preceding proof can be adapted to determine the automor- 
phism groups of more complicated graphs. We stress that a graph automorphism does not 
necessarily arise from a rigid geometric motion such as a rotation or a reflection; the only 
requirement is that the vertex relabeling map preserves all edges. 


FIGURE 7.3
Graph representing a 5 × 5 chessboard.


7.48. Example. Consider the graph B displayed in Figure 7.3, which models a 5 × 5
chessboard. What are the automorphisms of B? We note that B has four vertices of degree
2: a, e, v, and z. An automorphism φ of B must restrict to give a permutation of these four
vertices, since automorphisms preserve degree. Suppose, for example, that φ(a) = v. What
can φ(b) be in this situation? Answer: φ(b) must be a vertex adjacent to φ(a), namely q
or w. In the case φ(b) = q, φ(c) = k is forced by degree considerations, whereas φ(b) = w
forces φ(c) = x. Continuing this reasoning as we work around the outer border of the figure,
we see that the action of φ on all vertices on the border is completely determined by where
a and b go. A tedious but routine argument then shows that the images of the remaining
vertices are also forced. Since a can map to one of the four corners, and then b can map to
one of the two neighbors of φ(a), there are at most 4 × 2 = 8 automorphisms of B. Here are
the eight possibilities in cycle form:
    r_0 = (a)(b)(c) ··· (x)(y)(z) = id;
    r_1 = (a,e,z,v)(b,j,y,q)(c,p,x,k)(d,u,w,f)(g,i,t,r)(h,n,s,l)(m);
    r_2 = (a,z)(b,y)(c,x)(d,w)(e,v)(j,q)(p,k)(u,f)(g,t)(h,s)(i,r)(n,l)(m);
    r_3 = (a,v,z,e)(b,q,y,j)(c,k,x,p)(d,f,w,u)(g,r,t,i)(h,l,s,n)(m);
    s_{y=0} = (a,v)(b,w)(c,x)(d,y)(e,z)(f,q)(g,r)(h,s)(i,t)(j,u)(k)(l)(m)(n)(p);
    s_{x=0} = (a,e)(b,d)(f,j)(g,i)(k,p)(l,n)(q,u)(r,t)(v,z)(w,y)(c)(h)(m)(s)(x);
    s_{y=x} = (a,z)(b,u)(c,p)(d,j)(f,y)(g,t)(h,n)(k,x)(l,s)(q,w)(e)(i)(m)(r)(v);
    s_{y=−x} = (b,f)(c,k)(d,q)(e,v)(h,l)(i,r)(j,w)(n,s)(p,x)(u,y)(a)(g)(m)(t)(z).


One may check that all of these maps really are automorphisms of B, so |Aut(B)| = 8.
This graph has essentially the same symmetries as the cycle C_4: four rotations and four
reflections. (The subscripts of the reflections indicate the axis of reflection, taking m to be 
located at the origin.) By directing edges appropriately, we could produce a graph with only 
four automorphisms (the rotations). These graphs and groups play a crucial role in solving 
the chessboard-coloring problem mentioned in the introduction to this chapter. 
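
The count |Aut(B)| = 8 can also be checked by a computer search that mirrors the forcing argument of Example 7.48. The sketch below is an added illustration, not the author's method. It assumes B is the grid graph in which two squares of the board are adjacent when they share a side, with squares labeled by (row, column) pairs instead of letters, and it extends a partial map one vertex at a time, discarding choices that break adjacency or degree.

```python
def grid_graph(rows, cols):
    """Adjacency sets of the grid graph: squares are adjacent if they share a side."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for (r, c) in adj:
        for (dr, dc) in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj


def count_automorphisms(adj):
    """Count automorphisms of a simple graph given as {vertex: set of neighbors}."""
    order = list(adj)

    def extend(k, f, used):
        # f maps order[0], ..., order[k-1] injectively and preserves
        # adjacency and non-adjacency among those vertices.
        if k == len(order):
            return 1
        u = order[k]
        total = 0
        for v in adj:
            if v in used or len(adj[v]) != len(adj[u]):
                continue  # images must be new and have the same degree
            if all((w in adj[u]) == (f[w] in adj[v]) for w in order[:k]):
                f[u] = v
                used.add(v)
                total += extend(k + 1, f, used)
                used.remove(v)
                del f[u]
        return total

    return extend(0, {}, set())
```

Calling count_automorphisms(grid_graph(5, 5)) should give 8, in agreement with the example.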


7.49. Example. As a final illustration of the calculation of an automorphism group, consider
the graph C shown in Figure 7.4, which models a three-dimensional cube. We have
taken the vertex set of C to be {0,1}^3, the set of binary words of length 3. Which bijections
f : {0,1}^3 → {0,1}^3 might be automorphisms of C? First, f(000) can be any of the eight
vertices. Next, the three neighbors of 000 (namely 001, 010, and 100) can be mapped bijectively
onto the three neighbors of f(000) in any of 3! = 6 ways. The images of the remaining
four vertices are now uniquely determined, as one may check. By the Product Rule, there
are at most 8 × 6 = 48 automorphisms of C. A routine but tedious verification shows that all
of these potential automorphisms really are automorphisms, so |Aut(C)| = 48. By chance,
all of these automorphisms can be implemented geometrically using appropriate rotations
and reflections in three-dimensional space. Here are two of the six automorphisms of C that
send 000 to 110, written in two-line form:

    f_1 = [ 000 001 010 011 100 101 110 111 ]
          [ 110 100 111 101 010 000 011 001 ],

    f_2 = [ 000 001 010 011 100 101 110 111 ]
          [ 110 100 010 000 111 101 011 001 ].


FIGURE 7.4
The cube graph.
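
The counts in Example 7.49 can be verified by testing all 8! = 40320 bijections of the vertex set. The Python sketch below is an added illustration; it assumes that two binary words are adjacent in the cube graph exactly when they differ in one position, and the variable names are chosen here for convenience.

```python
from itertools import permutations, product

# Vertices are the binary words of length 3; two words are adjacent
# when they differ in exactly one position.
vertices = ["".join(bits) for bits in product("01", repeat=3)]
edges = {
    frozenset((u, v))
    for u in vertices
    for v in vertices
    if sum(a != b for a, b in zip(u, v)) == 1
}


def is_automorphism(f):
    """f is a dict describing a bijection on the vertices."""
    # For a bijection on a finite graph, sending every edge to an edge
    # already forces the "iff" condition in the definition of automorphism.
    return all(frozenset((f[u], f[v])) in edges for u, v in map(tuple, edges))


automorphisms = [
    f
    for image in permutations(vertices)
    for f in [dict(zip(vertices, image))]
    if is_automorphism(f)
]
sending_000_to_110 = [f for f in automorphisms if f["000"] == "110"]
```

The list automorphisms should have 48 entries, and sending_000_to_110 should contain six maps, among them the automorphisms written out in two-line form in the example.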


7.7 Group Homomorphisms 


We turn next to group homomorphisms, which are functions from one group to another 
group that preserve the algebraic structure. 


7.50. Definition: Group Homomorphisms. Let G and H be groups with operations
* and •, respectively. A function f : G → H is called a group homomorphism iff

    f(x * y) = f(x) • f(y)   for all x, y ∈ G.

A group isomorphism is a bijective group homomorphism.


7.51. Example. Define f : R → R_{>0} by f(x) = e^x for all x ∈ R. This function is a
group homomorphism from the additive group R to the multiplicative group R_{>0}, since
f(x + y) = e^{x+y} = e^x · e^y = f(x) · f(y) for all x, y ∈ R. In fact, f is a group isomorphism
since g : R_{>0} → R, given by g(x) = ln x for x > 0, is a two-sided inverse for f.


7.52. Example. Define h : C → R by h(x + iy) = x for all x, y ∈ R. It can be checked
that h is an additive group homomorphism that is surjective but not injective. Next, define
r : C − {0} → R − {0} by setting r(x + iy) = |x + iy| = √(x^2 + y^2). Given nonzero w = x + iy
and z = u + iv with x, y, u, v ∈ R, we calculate

    r(wz) = r((xu − yv) + i(yu + xv)) = √((xu − yv)^2 + (yu + xv)^2)
          = √((x^2 + y^2)(u^2 + v^2)) = r(w) r(z).

So r is a homomorphism of multiplicative groups.


7.53. Example. For any group G, the identity map id_G : G → G is a group isomorphism.
More generally, if H ≤ G, then the inclusion map j : H → G (defined by j(h) = h for all
h ∈ H) is a group homomorphism. If f : G → K and g : K → P are group homomorphisms,
then g ∘ f : G → P is a group homomorphism, since for all x, y ∈ G,

    (g ∘ f)(xy) = g(f(xy)) = g(f(x)f(y)) = g(f(x))g(f(y)) = (g ∘ f)(x)(g ∘ f)(y).

Moreover, g ∘ f is an isomorphism if f and g are isomorphisms, since the composition of
bijections is a bijection. If f : G → K is an isomorphism, then f^{-1} : K → G is also an
isomorphism. For suppose u, v ∈ K. Write x = f^{-1}(u) and y = f^{-1}(v), so u = f(x) and
v = f(y). Since f is a group homomorphism, it follows that uv = f(xy). Applying f^{-1} to
this relation, we get f^{-1}(uv) = xy = f^{-1}(u)f^{-1}(v).


7.54. Definition: Automorphisms of a Group. An automorphism of a group G is a 
group isomorphism f : G → G. Let Aut(G) denote the set of all such automorphisms.


The remarks in the preceding example (taking K = P = G) show that Aut(G) is a 
subgroup of the symmetric group Sym(G), where the operation is composition of functions. 


7.55. Example: Inner Automorphisms. Let G be any group, and fix an element g ∈ G.
Define a map C_g : G → G (called conjugation by g) by setting C_g(x) = g x g^{-1} for x ∈ G.
The map C_g is a group homomorphism, since C_g(xy) = g(xy)g^{-1} = (g x g^{-1})(g y g^{-1}) =
C_g(x) C_g(y) for all x, y ∈ G. Furthermore, C_g is a group isomorphism, since a calculation
shows that C_{g^{-1}} is the two-sided inverse of C_g. It follows that C_g ∈ Aut(G) for every
g ∈ G. We call automorphisms of the form C_g inner automorphisms of G. It is possible
for different group elements to induce the same inner automorphism of G. For example, if
G is commutative, then C_g(x) = g x g^{-1} = g g^{-1} x = x for all g, x ∈ G, so that all of the
conjugation maps C_g reduce to id_G.


The next theorem clarifies our initial comment that group homomorphisms “preserve 
the algebraic structure.” 


7.56. Theorem: Properties of Group Homomorphisms. Let f : G → H be a group
homomorphism. For all n ∈ Z and all x ∈ G, f(x^n) = f(x)^n. In particular, f(e_G) = e_H and
f(x^{-1}) = f(x)^{-1}. We say that f preserves powers, identities, and inverses.


Proof. First we prove the result for all n ≥ 0 by induction on n. When n = 0, we must
prove that f(e_G) = e_H. We know e_G e_G = e_G. Applying f to both sides of this equation
gives

    f(e_G) f(e_G) = f(e_G e_G) = f(e_G) = f(e_G) e_H.

By left cancellation of f(e_G) in H, we conclude that f(e_G) = e_H. For the induction step,
assume n ≥ 0 and f(x^n) = f(x)^n; we will prove f(x^{n+1}) = f(x)^{n+1}. Using the definition of
exponent notation, we calculate

    f(x^{n+1}) = f(x^n x) = f(x^n) f(x) = f(x)^n f(x) = f(x)^{n+1}.

Next, let us prove the result when n = −1. Given x ∈ G, apply f to the equation x x^{-1} = e_G
to obtain

    f(x) f(x^{-1}) = f(x x^{-1}) = f(e_G) = e_H = f(x) f(x)^{-1}.

Left cancellation of f(x) gives f(x^{-1}) = f(x)^{-1}. Finally, consider an arbitrary negative
integer n = −m, where m > 0. We have

    f(x^n) = f((x^{-1})^m) = f(x^{-1})^m = (f(x)^{-1})^m = f(x)^{-m} = f(x)^n. □


We can use group homomorphisms to construct more examples of subgroups.


7.57. Definition: Kernel and Image of a Homomorphism. Let f : G → H be a
group homomorphism. The kernel of f, denoted ker(f), is the set of all x ∈ G such that
f(x) = e_H. The image of f, denoted img(f) or f[G], is the set of all y ∈ H such that
y = f(x) for some x ∈ G.


One readily verifies that ker(f) ⊴ G and img(f) ≤ H; i.e., the kernel of f is a normal
subgroup of the domain of f, and the image of f is a subgroup of the codomain of f. 


7.58. Example. Consider the group homomorphisms h and r from Example 7.52, given
by h(x + iy) = x and r(z) = |z| for x, y ∈ R and nonzero z ∈ C. The kernel of h is the set
of pure imaginary numbers {iy : y ∈ R}. The kernel of r is the unit circle {z ∈ C : |z| = 1}.
The image of h is all of R, while the image of r is R_{>0}.


7.59. Example: Even Permutations. By Theorem 7.31, the function sgn : S_n →
{+1, −1} is a group homomorphism. The kernel of this homomorphism, which is denoted
A_n, consists of all f ∈ S_n such that sgn(f) = +1. Such f are called even permutations
since they can be written as products of an even number of transpositions. A_n is called the
alternating group on n symbols. We show later (Corollary 7.102) that |A_n| = |S_n|/2 = n!/2
for all n ≥ 2.
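
Both claims in this example can be tested exhaustively for small n. The sketch below is an added illustration that assumes permutations are tuples in one-line form and computes sgn directly from the inversion count; the function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def sgn(f):
    """Sign of a permutation in one-line form, computed as (-1)^inv(f)."""
    n = len(f)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if f[a] > f[b])
    return -1 if inversions % 2 else 1


def compose(f, g):
    """(f o g)(x) = f(g(x)) for permutations in one-line form with values 1..n."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def sgn_is_homomorphism(n):
    """Check sgn(f o g) = sgn(f) sgn(g) for all f, g in S_n."""
    S_n = list(permutations(range(1, n + 1)))
    return all(sgn(compose(f, g)) == sgn(f) * sgn(g) for f in S_n for g in S_n)


def alternating_group(n):
    """The kernel of sgn: all even permutations in S_n."""
    return [f for f in permutations(range(1, n + 1)) if sgn(f) == 1]
```

For example, alternating_group(4) should contain 4!/2 = 12 permutations.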


The next example illustrates how group homomorphisms can reveal structural informa- 
tion about groups. 


7.60. Example: Analysis of Cyclic Subgroups. Let G be any group, written multiplicatively,
and fix an element x ∈ G. Define f : Z → G by setting f(n) = x^n for all n ∈ Z.
By the Laws of Exponents, f is a group homomorphism. The image of f is precisely ⟨x⟩,
the cyclic subgroup of G generated by x. The kernel of f is some subgroup of Z, which by
Theorem 7.40 has the form mZ for some integer m ≥ 0. Consider the case where m = 0.
For all i ∈ Z, x^i = e_G iff f(i) = e_G iff i ∈ ker(f) iff i = 0, so x^0 is the only power of x that
equals the identity of G. We say that x has infinite order when m = 0. In this case, for all
i, j ∈ Z, i ≠ j implies x^i ≠ x^j, since

    x^i = x^j ⇒ x^{i−j} = e_G ⇒ i − j = 0 ⇒ i = j.

So the sequence of group elements (..., x^{-2}, x^{-1}, x^0, x^1, x^2, ...) contains no repetitions, and
the function f : Z → G is injective. By shrinking the codomain of f, we obtain a group
isomorphism f' : Z → ⟨x⟩.

Now consider the case where m > 0. For all i ∈ Z, x^i = e_G iff f(i) = e_G iff i ∈ ker(f)
iff i is a multiple of m. We say that x has order m in this case; thus, the order of x is the
least positive exponent n such that x^n = e_G. We claim that the cyclic group ⟨x⟩ consists
of the m distinct elements x^0, x^1, x^2, ..., x^{m−1}. For, given an arbitrary element x^n ∈ ⟨x⟩,
we can divide n by m to get n = mq + r for some q, r ∈ Z with 0 ≤ r < m. Then
x^n = x^{mq+r} = (x^m)^q x^r = e_G^q x^r = x^r, so x^n is equal to one of the elements in our list.
Furthermore, the listed elements are distinct. For suppose 0 ≤ i ≤ j < m and x^i = x^j.
Then x^{j−i} = e_G, forcing m to divide j − i. But 0 ≤ j − i < m, so the only possibility is
j − i = 0, hence i = j. Consider the function g : Z_m → ⟨x⟩ given by g(i) = x^i for 0 ≤ i < m.
This function is a well-defined bijection by the preceding remarks. Furthermore, g is a group
homomorphism. To check this, let i, j ∈ Z_m. If i + j < m, then

    g(i ⊕ j) = g(i + j) = x^{i+j} = x^i x^j = g(i)g(j).

If i + j ≥ m, then

    g(i ⊕ j) = g(i + j − m) = x^{i+j−m} = x^i x^j (x^m)^{-1} = x^i x^j = g(i)g(j).

So g is an isomorphism from the additive group Z_m to the cyclic subgroup generated by x.

To summarize, we have shown that every cyclic group ⟨x⟩ is isomorphic to one of the
additive groups Z or Z_m for some m > 0. The first case occurs when x has infinite order,
and the second case occurs when x has order m.
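
For an element of finite order, the isomorphism g : Z_m → ⟨x⟩ of Example 7.60 can be checked directly. The following Python sketch is an added illustration that takes x to be a permutation in one-line form; the function names are chosen here and are not from the text.

```python
def compose(f, g):
    """(f o g)(x) = f(g(x)) for permutations in one-line form with values 1..n."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def order(x):
    """Least positive m such that x^m is the identity permutation."""
    identity = tuple(range(1, len(x) + 1))
    power, m = x, 1
    while power != identity:
        power = compose(power, x)
        m += 1
    return m


def powers(x):
    """The list [x^0, x^1, ..., x^(m-1)], where m is the order of x."""
    result = [tuple(range(1, len(x) + 1))]
    for _ in range(order(x) - 1):
        result.append(compose(result[-1], x))
    return result


def is_isomorphism_from_Z_m(x):
    """Check that i -> x^i is a bijective homomorphism from Z_m onto <x>."""
    g = powers(x)
    m = len(g)
    bijective = len(set(g)) == m
    homomorphism = all(
        g[(i + j) % m] == compose(g[i], g[j]) for i in range(m) for j in range(m)
    )
    return bijective and homomorphism


x = (3, 1, 4, 2)  # the 4-cycle (1,3,4,2) in one-line form
```

Here order(x) should be 4, and powers(x) lists the four elements of the cyclic subgroup generated by (1,3,4,2).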


7.8 Group Actions


The fundamental tool needed to solve counting problems involving symmetry is the notion 
of a group action. 


7.61. Definition: Group Actions. Suppose G is a group and X is a set. An action of
G on X is a function * : G × X → X satisfying the following axioms:


1. For all g ∈ G and all x ∈ X, g * x ∈ X (closure).
2. For all x ∈ X, e_G * x = x (identity).
3. For all g, h ∈ G and all x ∈ X, g * (h * x) = (gh) * x (associativity).


The set X with the given action * is called a G-set. 


7.62. Example. For any set X, the symmetric group G = Sym(X) acts on X via the rule
g * x = g(x) for g ∈ G and x ∈ X. Axiom 1 holds because each g ∈ G is a function from X to
X, hence g * x = g(x) ∈ X for all x ∈ X. Axiom 2 holds since e_G * x = id_X * x = id_X(x) = x
for all x ∈ X. Axiom 3 holds because

    g * (h * x) = g(h(x)) = (g ∘ h)(x) = (gh) * x   for all g, h ∈ Sym(X) and all x ∈ X.


7.63. Example. Let G be any group, written multiplicatively, and let X be the set G.
Define * : G × X → X by g * x = gx for all g, x ∈ G. We say that “G acts on itself by
left multiplication.” In this example, the action axioms reduce to the corresponding group
axioms for G.

We can define another action of G on the set X = G by letting g • x = x g^{-1} for all
g, x ∈ G. The first two group action axioms are immediately verified; the third axiom follows
by calculating (for g, h, x ∈ G)

    g • (h • x) = g • (x h^{-1}) = (x h^{-1}) g^{-1} = x(h^{-1} g^{-1}) = x(gh)^{-1} = (gh) • x.

We say that “G acts on itself by inverted right multiplication.” It can be checked that the
rule g · x = xg (for g, x ∈ G) does not define a group action for non-commutative groups G,
because Axiom 3 fails. But see the discussion of right group actions below.
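
The three axioms of Definition 7.61 can be tested mechanically for a small group. The Python sketch below is an added illustration that takes G = S_3 acting on X = G, with permutations stored as tuples in one-line form; the function and variable names are hypothetical.

```python
from itertools import permutations


def compose(f, g):
    """(f o g)(x) = f(g(x)) for permutations in one-line form with values 1..n."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def inverse(f):
    """Inverse permutation of f."""
    result = [0] * len(f)
    for i, v in enumerate(f):
        result[v - 1] = i + 1
    return tuple(result)


def is_action(act, G, X, identity):
    """Check the closure, identity, and associativity axioms for act(g, x)."""
    closure = all(act(g, x) in X for g in G for x in X)
    ident = all(act(identity, x) == x for x in X)
    assoc = all(
        act(g, act(h, x)) == act(compose(g, h), x) for g in G for h in G for x in X
    )
    return closure and ident and assoc


G = list(permutations(range(1, 4)))  # the group S_3
X = set(G)                           # G acting on itself
e = (1, 2, 3)


def left(g, x):                      # g * x = g x
    return compose(g, x)


def inverted_right(g, x):            # g * x = x g^{-1}
    return compose(x, inverse(g))


def plain_right(g, x):               # g * x = x g (not an action of S_3)
    return compose(x, g)


def conjugation(g, x):               # g * x = g x g^{-1}
    return compose(compose(g, x), inverse(g))
```

Since S_3 is not commutative, is_action should reject plain_right while accepting the other three rules.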


7.64. Example. Let the group G act on the set X = G as follows:

    g * x = g x g^{-1}   for all g ∈ G and x ∈ X.

We say that “G acts on itself by conjugation.” The reader may verify that the axioms for 
an action are satisfied. 


7.65. Example. Suppose we are given a group action * : G × X → X. Let H be any
subgroup of G. By restricting the action function to the domain H × X, we obtain an
action of H on X, as one immediately verifies. Combining this construction with previous
examples, we obtain many additional instances of group actions. For example, any subgroup
H of a group G acts on G by left multiplication, and by inverted right multiplication, and by
conjugation. Any subgroup H of Sym(X) acts on X via f * x = f(x) for f ∈ H and x ∈ X.
In particular, the automorphism group Aut(G) of a group G is a subgroup of Sym(G), so
Aut(G) acts on G via f * x = f(x) for f ∈ Aut(G) and x ∈ G. Similarly, if K is a graph
with vertex set X, then Aut(K) is a subgroup of Sym(X), and therefore Aut(K) acts on
X via f * x = f(x) for f ∈ Aut(K) and x ∈ X.


7.66. Example. Suppose $X$ is a $G$-set with action $*$. Let $\mathcal{P}(X)$ be the power set of $X$,
which is the set of all subsets of $X$. It is routine to check that $\mathcal{P}(X)$ is a $G$-set under the
action

$$g \bullet S = \{g * s : s \in S\} \quad\text{for all } g \in G \text{ and } S \in \mathcal{P}(X).$$


7.67. Example. Consider the set $X = \mathbb{R}[x_1, x_2, \ldots, x_n]$ consisting of polynomials in
$x_1, \ldots, x_n$ with real coefficients. The symmetric group $S_n$ acts on $\{1, 2, \ldots, n\}$ via $f * i = f(i)$
for $f \in S_n$ and $1 \le i \le n$. We can transfer this to an action of $S_n$ on $\{x_1, \ldots, x_n\}$ by defining

$$f * x_i = x_{f(i)} \quad\text{for all } f \in S_n \text{ and } i \text{ between } 1 \text{ and } n.$$

Each function on $\{x_1, \ldots, x_n\}$ sending $x_i$ to $x_{f(i)}$ extends to a ring homomorphism
$E_f : X \to X$ that sends a polynomial $p = p(x_1, \ldots, x_n) \in X$ to $E_f(p) = p(x_{f(1)}, \ldots, x_{f(n)})$ (see
the Appendix). One may check that the rule $f * p = E_f(p)$ (for $f \in S_n$ and $p \in X$) defines
an action of $S_n$ on $X$. In particular, $g * (h * p) = (g \circ h) * p$ follows since both sides are the
image of $p$ under the unique ring homomorphism sending $x_i$ to $x_{g(h(i))}$ for all $i$.


7.68. Example. By imitating ideas in the previous example, we can define certain group
actions on vector spaces. Suppose $V$ is a real vector space and $X = (x_1, \ldots, x_n)$ is an ordered
basis of $V$. For $f \in S_n$, the map sending each $x_i$ to $x_{f(i)}$ uniquely extends by linearity to a
linear map $T_f : V \to V$, given explicitly by

$$T_f(a_1 x_1 + \cdots + a_n x_n) = a_1 x_{f(1)} + \cdots + a_n x_{f(n)} \quad\text{for all } a_1, \ldots, a_n \in \mathbb{R}.$$

One may check that $f * v = T_f(v)$ (for $f \in S_n$ and $v \in V$) defines an action of the group
$S_n$ on the set $V$.


7.69. Example. Suppose $G$ is a group, $X$ is a $G$-set with action $*$, and $W$ and $Z$ are any
sets. Let $\mathrm{Fun}(W, X)$ denote the set of all functions $F : W \to X$. This set of functions can
be turned into a $G$-set by defining

$$(g \bullet F)(w) = g * (F(w)) \quad\text{for all } g \in G,\ F \in \mathrm{Fun}(W, X), \text{ and } w \in W,$$

as is readily verified.
Now consider the set $\mathrm{Fun}(X, Z)$ of all functions $F : X \to Z$. We claim this set of
functions becomes a $G$-set if we define

$$(g \bullet F)(x) = F(g^{-1} * x) \quad\text{for all } g \in G,\ F \in \mathrm{Fun}(X, Z), \text{ and } x \in X.$$


Let us carefully prove this claim. First, given $g \in G$ and $F \in \mathrm{Fun}(X, Z)$, the map $g \bullet F$
is a well-defined function from $X$ to $Z$ because $g^{-1} * x \in X$ and $F$ maps $X$ into $Z$. So,
$g \bullet F \in \mathrm{Fun}(X, Z)$, verifying closure. Second, to prove that $e \bullet F = F$ for all $F \in \mathrm{Fun}(X, Z)$,
we fix $x \in X$ and check that $(e \bullet F)(x) = F(x)$. We calculate

$$(e \bullet F)(x) = F(e^{-1} * x) = F(e * x) = F(x).$$

Third, we verify the associativity axiom for $\bullet$. Fix $g, h \in G$ and $F \in \mathrm{Fun}(X, Z)$. The two
functions $g \bullet (h \bullet F)$ and $(gh) \bullet F$ both have domain $X$ and codomain $Z$. Fix $x \in X$. On
one hand,

$$[(gh) \bullet F](x) = F((gh)^{-1} * x) = F((h^{-1}g^{-1}) * x).$$

On the other hand, using the definition of $\bullet$ twice,

$$[g \bullet (h \bullet F)](x) = [h \bullet F](g^{-1} * x) = F(h^{-1} * (g^{-1} * x)).$$

Since $*$ is known to be an action, we see that $g \bullet (h \bullet F)$ and $(gh) \bullet F$ have the same value
at $x$. So the third action axiom is proved. One may check that this axiom would fail, in
general, if we omitted the inverse in the definition of $\bullet$.


7.70. Example. Let $n$ be a fixed integer, let $Y$ be a set, and let

$$U = Y^n = \{(y_1, \ldots, y_n) : y_i \in Y\}$$

be the set of all sequences of $n$ elements of $Y$. The group $S_n$ acts on $U$ via the rule

$$f \cdot (y_1, y_2, \ldots, y_n) = (y_{f^{-1}(1)}, y_{f^{-1}(2)}, \ldots, y_{f^{-1}(n)}) \quad\text{for all } f \in S_n \text{ and all } y_1, \ldots, y_n \in Y.$$

The inverses in this formula are essential. To see why, we observe that the action here is in
fact a special case of the previous example. This is true because a sequence in $U$ is formally
defined to be a function $y : \{1, 2, \ldots, n\} \to Y$ where $y(i) = y_i$. Using this function notation
for sequences, we have (for all $f \in S_n$ and all $i$ between 1 and $n$)

$$(f \cdot y)(i) = y(f^{-1}(i)) = (f \bullet y)(i),$$

in agreement with the previous example. We can also say that acting by $f$ moves the
object $z$ originally in position $i$ to position $f(i)$ in the new sequence. This is true because
$(f \cdot y)(f(i)) = y(f^{-1}(f(i))) = y(i) = z$.

The reader may now be disturbed by the lack of inverses in the formula $f * x_i = x_{f(i)}$
from Example 7.68. However, that situation is different since $x_1, \ldots, x_n$ in that example
are fixed basis elements in a vector space $V$, not the entries in a sequence. Indeed, recall
that the action on $V$ is given by $f * v = T_f(v)$ where $T_f$ is the linear extension of the map
sending $x_i$ to $x_{f(i)}$. Writing $v = \sum_{i=1}^{n} a_i x_i$, the coordinates of $v$ relative to this basis are
the entries in the sequence $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$. Applying $f$ to $v$ gives

$$f * v = \sum_{i=1}^{n} a_i x_{f(i)} = \sum_{j=1}^{n} a_{f^{-1}(j)} x_j,$$

after changing variables by letting $j = f(i)$, $i = f^{-1}(j)$. We now see that the coordinates
of $f * v$ relative to the ordered basis $(x_1, \ldots, x_n)$ are $(a_{f^{-1}(1)}, \ldots, a_{f^{-1}(n)})$. For example,

$$(1,2,3) * (a_1 x_1 + a_2 x_2 + a_3 x_3) = a_1 x_2 + a_2 x_3 + a_3 x_1 = a_3 x_1 + a_1 x_2 + a_2 x_3,$$

or equivalently, in coordinate notation,

$$(1,2,3) * (a_1, a_2, a_3) = (a_3, a_1, a_2).$$

To summarize, when $f$ acts directly on the objects $x_i$, no inverse is needed; but when $f$
permutes the positions in a list, one must apply $f^{-1}$ to each subscript.
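
The role of the inverse can be tested directly. The Python sketch below implements the position-permuting rule with $f^{-1}$ and a variant without the inverse, then checks the associativity axiom over all of $S_3$. The tuple encoding of permutations and the names `act`, `act_no_inverse`, and `satisfies_axiom3` are assumptions of this illustration.

```python
from itertools import permutations

# Assumed encoding: f in S_n is the tuple (f(1), ..., f(n)).
def compose(f, g):
    """Composition f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def inverse(f):
    """Inverse permutation of f."""
    out = [0] * len(f)
    for i, v in enumerate(f, start=1):
        out[v - 1] = i
    return tuple(out)

def act(f, y):
    """(f . y)_i = y_{f^{-1}(i)}: the entry in position i moves to position f(i)."""
    finv = inverse(f)
    return tuple(y[finv[i] - 1] for i in range(len(y)))

def act_no_inverse(f, y):
    """Candidate rule (f . y)_i = y_{f(i)}, with the inverse omitted."""
    return tuple(y[f[i] - 1] for i in range(len(y)))

def satisfies_axiom3(rule, y):
    """Check g . (h . y) == (gh) . y for all g, h in S_n, where n = len(y)."""
    S_n = list(permutations(range(1, len(y) + 1)))
    return all(rule(g, rule(h, y)) == rule(compose(g, h), y)
               for g in S_n for h in S_n)

cycle_123 = (2, 3, 1)  # the 3-cycle (1,2,3): 1 -> 2 -> 3 -> 1
print(act(cycle_123, ("a1", "a2", "a3")))            # ('a3', 'a1', 'a2')
print(satisfies_axiom3(act, ("a1", "a2", "a3")))     # True
print(satisfies_axiom3(act_no_inverse, ("a1", "a2", "a3")))  # False
```

The first printed line reproduces the coordinate computation $(1,2,3) * (a_1, a_2, a_3) = (a_3, a_1, a_2)$ from the text.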


7.71. Remark: Right Actions. A right action of a group $G$ on a set $X$ is a function
$* : X \times G \to X$ such that $x * e = x$ and $x * (gh) = (x * g) * h$ for all $x \in X$ and all
$g, h \in G$. For example, $x * g = xg$ (with no inverse) defines a right action of a group $G$ on
the set $X = G$. Similarly, we get a right action of $S_n$ on the set of sequences in the previous
example by writing

$$(y_1, \ldots, y_n) * f = (y_{f(1)}, \ldots, y_{f(n)}).$$

Group actions (as defined at the beginning of this section) are sometimes called left actions
to avoid confusion with right actions. We shall mostly consider left group actions, but right
actions are occasionally more convenient to use (see, for example, Definition 7.90).


7.9 Permutation Representations


Group actions are closely related to symmetric groups. To understand the precise nature of 
this relationship, we need the following definition. 


7.72. Definition: Permutation Representations. A permutation representation of a
group $G$ on a set $X$ is a group homomorphism $\phi : G \to \mathrm{Sym}(X)$.


This definition seems quite different from the definition of a group action given in the 
last section. But we show in this section that group actions and permutation representations 
are essentially the same thing. Both viewpoints turn out to be pertinent in the application 
of group actions to problems in combinatorics and algebra. 

We first show that any group action of G on X can be used to construct an associated 
permutation representation of G on X. The key idea appears in the next definition. 


7.73. Definition: Left Multiplication Maps. Let $* : G \times X \to X$ be an action of the
group $G$ on the set $X$. For each $g \in G$, left multiplication by $g$ (relative to this action) is
the function $L_g : X \to X$ defined by

$$L_g(x) = g * x \quad\text{for all } x \in X.$$

Note that the outputs of $L_g$ are members of the codomain $X$, by the closure axiom for
group actions.


7.74. Theorem: Properties of Left Multiplication Maps. Let a group $G$ act on a set
$X$. (a) $L_e = \mathrm{id}_X$. (b) For all $g, h \in G$, $L_{gh} = L_g \circ L_h$. (c) For all $g \in G$, $L_g \in \mathrm{Sym}(X)$, and
$L_g^{-1} = L_{g^{-1}}$.


Proof. All functions appearing here have domain $X$ and codomain $X$. So it suffices to check
that the relevant functions take the same value at each $x \in X$. For (a), $L_e(x) = e * x = x =
\mathrm{id}_X(x)$ by the identity axiom for group actions. For (b), $L_{gh}(x) = (gh) * x = g * (h * x) =
L_g(L_h(x)) = (L_g \circ L_h)(x)$ by the associativity axiom for group actions. Finally, using (a) and
(b) with $h = g^{-1}$ shows that $\mathrm{id}_X = L_g \circ L_{g^{-1}}$. Similarly, $\mathrm{id}_X = L_{g^{-1}} \circ L_g$. This means that
$L_{g^{-1}}$ is the two-sided inverse of $L_g$; in particular, both of these maps must be bijections. □


Using the theorem, we can pass from a group action $*$ to a permutation representation
$\phi$ as follows. Define $\phi : G \to \mathrm{Sym}(X)$ by setting $\phi(g) = L_g$ for all $g \in G$. We just saw that
$\phi$ does map into the codomain $\mathrm{Sym}(X)$, and

$$\phi(gh) = L_{gh} = L_g \circ L_h = \phi(g) \circ \phi(h) \quad\text{for all } g, h \in G.$$

So $\phi$ is a group homomorphism.


7.75. Example: Cayley’s Theorem. We know that any group $G$ acts on the set $X = G$
by left multiplication. The preceding construction produces a group homomorphism
$\phi : G \to \mathrm{Sym}(G)$ such that $\phi(g) = L_g$ for all $g \in G$, and $L_g(x) = gx$ for all $x \in G$. We
claim that $\phi$ is injective in this situation. For, suppose $g, h \in G$ and $L_g = L_h$. Applying
these two functions to $e$ (the identity of $G$) gives $L_g(e) = L_h(e)$, so $ge = he$, so $g = h$. It
follows that $G$ is isomorphic (via $\phi$) to the image of $\phi$, which is a subgroup of the symmetric
group $\mathrm{Sym}(G)$. This proves Cayley’s Theorem, which says that any group is isomorphic to
a subgroup of some symmetric group. If $G$ has $n$ elements, one can check that $\mathrm{Sym}(G)$ is
isomorphic to $S_n$. So every $n$-element group is isomorphic to a subgroup of the symmetric
group $S_n$.
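
As an illustration of Cayley’s Theorem, the following Python sketch builds the left multiplication maps $L_g$ for $G = S_3$ and checks that $g \mapsto L_g$ is an injective homomorphism into the permutations of the six positions of a list holding $G$. Representing $L_g$ by positions in that list, and the names `left_mult_map` and `compose_maps`, are choices made for this sketch only.

```python
from itertools import permutations

# Assumed encoding: a permutation p of {1,2,3} is the tuple (p(1), p(2), p(3)).
G = list(permutations((1, 2, 3)))

def mul(f, g):
    """Group product f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def left_mult_map(g):
    """L_g encoded on positions of the list G: position of x -> position of gx."""
    return tuple(G.index(mul(g, x)) for x in G)

def compose_maps(a, b):
    """Composition a o b of two maps on positions 0..|G|-1."""
    return tuple(a[b[i]] for i in range(len(b)))

phi = {g: left_mult_map(g) for g in G}

# Each L_g is a bijection of the six positions.
is_permutation = all(sorted(phi[g]) == list(range(len(G))) for g in G)
# phi(gh) = phi(g) o phi(h) for all g, h.
is_homomorphism = all(phi[mul(g, h)] == compose_maps(phi[g], phi[h])
                      for g in G for h in G)
# Distinct group elements give distinct maps.
is_injective = len(set(phi.values())) == len(G)

print(is_permutation, is_homomorphism, is_injective)  # True True True
```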


7.76. Example. For any set $X$, we know $\mathrm{Sym}(X)$ acts on $X$ via $f * x = f(x)$ for $f \in \mathrm{Sym}(X)$
and $x \in X$. What is the associated permutation representation $\phi : \mathrm{Sym}(X) \to \mathrm{Sym}(X)$?
First note that for $f \in \mathrm{Sym}(X)$, left multiplication by $f$ is the map $L_f : X \to X$ such that
$L_f(x) = f * x = f(x)$. In other words, $L_f = f$, so that $\phi(f) = L_f = f$. This means that $\phi$
is the identity homomorphism. More generally, whenever a subgroup $H$ of $\mathrm{Sym}(X)$ acts on
$X$ via $h * x = h(x)$, the corresponding permutation representation is the inclusion map of
$H$ into $\mathrm{Sym}(X)$.


So far, we have seen that every group action of $G$ on $X$ has an associated permutation
representation. We can reverse this process by starting with an arbitrary permutation
representation $\phi : G \to \mathrm{Sym}(X)$ and building a group action, as follows. Given $\phi$, define
$* : G \times X \to X$ by setting $g * x = \phi(g)(x)$ for all $g \in G$ and $x \in X$. Note that $\phi(g)$ is a
function with domain $X$, so the expression $\phi(g)(x)$ denotes a well-defined element of $X$. In
particular, $*$ satisfies the closure axiom for an action. Since group homomorphisms preserve
identities, $\phi(e) = \mathrm{id}_X$, and so $e * x = \phi(e)(x) = \mathrm{id}_X(x) = x$ for all $x \in X$. So the identity
axiom holds. Finally, using the fact that $\phi$ is a group homomorphism, we calculate

$$(gh) * x = \phi(gh)(x) = (\phi(g) \circ \phi(h))(x) = \phi(g)(\phi(h)(x)) = g * (h * x) \quad\text{for all } g, h \in G \text{ and all } x \in X.$$

So the associativity axiom holds, completing the proof that $*$ is a group action.
The following theorem is the formal enunciation of our earlier claim that group actions
and permutation representations are essentially the same thing.


7.77. Theorem: Equivalence of Group Actions and Permutation Representations.
Fix a group $G$ and a set $X$. Let $A$ be the set of all group actions of $G$ on $X$, and
let $P$ be the set of all permutation representations of $G$ on $X$. There are mutually inverse
bijections $F : A \to P$ and $H : P \to A$, given by

$$F(*) = \phi : G \to \mathrm{Sym}(X) \quad\text{where } \phi(g) = L_g \text{ and } L_g(x) = g * x \text{ for all } x \in X \text{ and } g \in G;$$
$$H(\phi) = * : G \times X \to X \quad\text{where } g * x = \phi(g)(x) \text{ for all } g \in G \text{ and } x \in X.$$


Proof. The discussion preceding the theorem shows that $F$ maps the set $A$ into the codomain
$P$, and that $H$ maps the set $P$ into the codomain $A$. We must verify that $F \circ H = \mathrm{id}_P$ and
$H \circ F = \mathrm{id}_A$.

To show $F \circ H = \mathrm{id}_P$, fix $\phi \in P$, and write $* = H(\phi)$ and $\psi = F(*)$. We must confirm
the equality of $\psi$ and $\phi$, which are both functions from $G$ into $\mathrm{Sym}(X)$. To do this, fix
$g \in G$, and ask whether the two functions $\psi(g), \phi(g) : X \to X$ are equal. For each $x \in X$,

$$\psi(g)(x) = L_g(x) = g * x = \phi(g)(x).$$

So $\psi(g) = \phi(g)$ for all $g$, hence $\psi = \phi$ as needed.

To show $H \circ F = \mathrm{id}_A$, fix $* \in A$, and write $\phi = F(*)$ and $\bullet = H(\phi)$. We must confirm
the equality of $\bullet$ and $*$, which are both functions from $G \times X$ into $X$. For this, fix $g \in G$
and $x \in X$. Now compute

$$g \bullet x = \phi(g)(x) = L_g(x) = g * x. \qquad \square$$


7.78. Example. We can use permutation representations to generate new constructions of
group actions. For instance, suppose $X$ is a $G$-set with action $*$ and associated permutation
representation $\phi : G \to \mathrm{Sym}(X)$. Now suppose we are given a group homomorphism $u : K \to
G$. Composing with $\phi$ gives a homomorphism $\phi \circ u : K \to \mathrm{Sym}(X)$. This is a permutation
representation of $K$ on $X$, which leads to an action of $K$ on $X$. Specifically, by applying
the map $H$ from the theorem, we see that the new action is given by

$$k \bullet x = u(k) * x \quad\text{for all } k \in K \text{ and } x \in X.$$


7.10 Stable Subsets and Orbits


One way to gain information about a group is to study its subgroups. The analogous concept 
for G-sets appears in the next definition. 


7.79. Definition: G-Stable Subsets. Let a group $G$ act on a set $X$. A subset $Y$ of $X$ is
called a $G$-stable subset iff $g * y \in Y$ for all $g \in G$ and all $y \in Y$.


When Y is a G-stable subset, the restriction of * to G x Y maps into the codomain Y, 
by definition. Since the identity axiom and associativity axiom still hold for the restricted 
action, we see that Y is a G-set. 

Recall that every element of a group generates a cyclic subgroup. Similarly, we can pass 
from an element of a G-set to a G-stable subset as follows. 


7.80. Definition: Orbits. Suppose a group $G$ acts on a set $X$, and $x \in X$. The orbit of
$x$ under this action is the set

$$Gx = G * x = \{g * x : g \in G\} \subseteq X.$$

Every orbit is a $G$-stable subset: for, given $h \in G$ and $g * x \in Gx$, the associativity axiom
gives $h * (g * x) = (hg) * x \in Gx$. Furthermore, by the identity axiom, $x = e * x \in Gx$ for
each $x \in X$. The orbit $Gx$ is the smallest $G$-stable subset of $X$ containing $x$.


7.81. Example. Let $S_5$ act on the set $X = \{1, 2, 3, 4, 5\}$ via $f * x = f(x)$ for $f \in S_5$ and
$x \in X$. For each $i \in X$, the orbit $S_5 * i = \{f(i) : f \in S_5\}$ is all of $X$. The reason is that for
any given $j$ in $X$, we can find an $f \in S_5$ such that $f(i) = j$; for instance, take $f = (i, j)$.
On the other hand, consider the subgroup $H = \langle (1,3)(2,4,5) \rangle$ of $S_5$. If we let $H$ act on $X$
via $f * x = f(x)$ for $f \in H$ and $x \in X$, we get different orbits. It can be checked that

$$H * 1 = H * 3 = \{1, 3\}, \qquad H * 2 = H * 4 = H * 5 = \{2, 4, 5\}.$$

Note that the $H$-orbits are precisely the connected components of the digraph representing
the generator $(1,3)(2,4,5)$ of $H$. This observation holds more generally whenever a cyclic
subgroup of $S_n$ acts on $\{1, 2, \ldots, n\}$.

Now consider the action of $A_5$ on $X$. As in the case of $S_5$, we have $A_5 * i = X$ for all
$i \in X$, but for a different reason. Given $j \in X$, we must now find an even permutation
sending $i$ to $j$. We can use the identity permutation if $i = j$. Otherwise, choose two distinct
elements $k, l$ that are different from $i$ and $j$, and use the permutation $(i, j)(k, l)$.
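
The orbits in this example can be recomputed mechanically. In the Python sketch below, the orbit of a point under a cyclic subgroup is found by repeatedly applying the generator (for a permutation of a finite set, inverses are positive powers, so this reaches the whole orbit), and the $A_5$-orbit is found directly from the definition. The tuple encoding and the names `orbit` and `is_even` are assumptions of this illustration.

```python
from itertools import permutations

# Assumed encoding: f in S_5 is the tuple (f(1), ..., f(5)).
def orbit(x, generators):
    """Orbit of the point x under the finite group generated by the given permutations."""
    seen = {x}
    frontier = [x]
    while frontier:
        y = frontier.pop()
        for f in generators:
            z = f[y - 1]          # z = f(y)
            if z not in seen:
                seen.add(z)
                frontier.append(z)
    return seen

def is_even(p):
    """A permutation is even iff its number of inversions is even."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

gen = (3, 4, 1, 5, 2)  # the permutation (1,3)(2,4,5)
h_orbits = {frozenset(orbit(x, [gen])) for x in range(1, 6)}
print(sorted(sorted(o) for o in h_orbits))  # [[1, 3], [2, 4, 5]]

A5 = [p for p in permutations(range(1, 6)) if is_even(p)]
a5_orbit_of_1 = {f[0] for f in A5}           # {f(1) : f in A_5}
print(sorted(a5_orbit_of_1))                 # [1, 2, 3, 4, 5]
```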
7.82. Example. Let $S_4$ act on the set $X$ of all 4-tuples of integers by permuting positions:

$$f * (x_1, x_2, x_3, x_4) = (x_{f^{-1}(1)}, x_{f^{-1}(2)}, x_{f^{-1}(3)}, x_{f^{-1}(4)}) \quad\text{for all } f \in S_4 \text{ and } x_i \in \mathbb{Z}.$$

The $S_4$-orbit of a sequence $x = (x_1, x_2, x_3, x_4)$ consists of all possible sequences reachable
from $x$ by permuting the entries. For example,

$$S_4 * (5,1,5,1) = \{(1,1,5,5), (1,5,1,5), (1,5,5,1), (5,1,1,5), (5,1,5,1), (5,5,1,1)\}.$$

As another example, $S_4 * (3,3,3,3) = \{(3,3,3,3)\}$ and $S_4 * (1,3,5,7)$ is the set of all
24 permutations of 1, 3, 5, 7. Now consider the cyclic subgroup $H = \langle (1,2,3,4) \rangle$ of $S_4$.
Restricting the action turns $X$ into an $H$-set. When computing orbits relative to the action
of $H$, we are only allowed to cyclically shift the elements in each 4-tuple. So, for instance,

$$H * (5,1,5,1) = \{(5,1,5,1), (1,5,1,5)\};$$
$$H * (1,3,5,7) = \{(1,3,5,7), (3,5,7,1), (5,7,1,3), (7,1,3,5)\}.$$

We see that the orbit of a given $x \in X$ depends heavily on which group is acting on $X$.
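
The dependence of the orbit on the acting group can be seen by computing both kinds of orbits for the same 4-tuples. This Python sketch is a brute-force check under an assumed tuple encoding of permutations; the helper names (`act`, `orbit`, `cyclic_subgroup`) are hypothetical.

```python
from itertools import permutations

# Assumed encoding: f in S_4 is the tuple (f(1), ..., f(4)).
def compose(f, g):
    """Composition f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def inverse(f):
    """Inverse permutation of f."""
    out = [0] * len(f)
    for i, v in enumerate(f, start=1):
        out[v - 1] = i
    return tuple(out)

def act(f, x):
    """Position action: (f * x)_i = x_{f^{-1}(i)}."""
    finv = inverse(f)
    return tuple(x[finv[i] - 1] for i in range(len(x)))

def orbit(x, group):
    """Orbit of the tuple x under a group given as a list of all its elements."""
    return {act(f, x) for f in group}

def cyclic_subgroup(c):
    """All powers of the permutation c, starting from the identity."""
    identity = tuple(range(1, len(c) + 1))
    elems, p = [identity], c
    while p != identity:
        elems.append(p)
        p = compose(p, c)
    return elems

S4 = list(permutations((1, 2, 3, 4)))
H = cyclic_subgroup((2, 3, 4, 1))  # generated by the 4-cycle (1,2,3,4)

print(len(orbit((5, 1, 5, 1), S4)), len(orbit((5, 1, 5, 1), H)))  # 6 2
print(len(orbit((1, 3, 5, 7), S4)), len(orbit((1, 3, 5, 7), H)))  # 24 4
```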


7.83. Example. Let a group $G$ act on itself by left multiplication: $g * x = gx$ for $g, x \in G$.
For every $x \in G$, the orbit $Gx$ is all of $G$. For, given any $y \in G$, we have $(yx^{-1}) * x = y$. In
the next section, we study what happens when a subgroup $H$ acts on $G$ by left (or right)
multiplication.


7.84. Example: Conjugacy Classes. Let $G$ be a group. We have seen that $G$ acts on
itself by conjugation: $g * x = gxg^{-1}$ for $g, x \in G$. The orbit of $x \in G$ under this action is
the set

$$G * x = \{gxg^{-1} : g \in G\}.$$

This set is called the conjugacy class of $x$ in $G$. For example, when $G = S_3$, the conjugacy
classes are:

$$G * \mathrm{id} = \{\mathrm{id}\};$$
$$G * (1,2) = G * (1,3) = G * (2,3) = \{(1,2), (1,3), (2,3)\};$$
$$G * (1,2,3) = G * (1,3,2) = \{(1,2,3), (1,3,2)\}.$$

One can confirm this with the aid of the identities

$$f \circ (i,j) \circ f^{-1} = (f(i), f(j)); \qquad f \circ (i,j,k) \circ f^{-1} = (f(i), f(j), f(k)) \quad\text{for all } f \in S_3.$$

The generalization of this example to any $S_n$ is discussed in §7.13. We observe in passing
that $G * x = \{x\}$ iff $gxg^{-1} = x$ for all $g \in G$ iff $gx = xg$ for all $g \in G$ iff $x$ commutes with
every element of $G$. In particular, for $G$ commutative, every conjugacy class consists of a
single element.
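
The conjugacy classes of $S_3$ listed above can be recovered by direct computation. The sketch below, which assumes a tuple encoding of permutations and hypothetical helper names, forms the set $\{gxg^{-1} : g \in G\}$ for each $x$ and collects the distinct classes.

```python
from itertools import permutations

# Assumed encoding: a permutation p of {1,2,3} is the tuple (p(1), p(2), p(3)).
G = list(permutations((1, 2, 3)))

def mul(f, g):
    """Group product f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def inv(f):
    """Inverse permutation of f."""
    out = [0] * len(f)
    for i, v in enumerate(f, start=1):
        out[v - 1] = i
    return tuple(out)

def conjugacy_class(x):
    """Orbit of x under conjugation: {g x g^{-1} : g in G}."""
    return frozenset(mul(mul(g, x), inv(g)) for g in G)

classes = {conjugacy_class(x) for x in G}
print(sorted(len(c) for c in classes))  # [1, 2, 3]

# Elements whose class is a singleton are exactly those commuting with all of G.
central = [x for x in G if conjugacy_class(x) == frozenset({x})]
print(central)  # [(1, 2, 3)], i.e. only the identity
```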


7.85. Example. Let $B = (X, E)$ be the graph representing a $5 \times 5$ chessboard shown in
Figure 7.3. Let the graph automorphism group $G = \mathrm{Aut}(B)$ act on $X$ via $f * x = f(x)$ for
$f \in G$ and $x \in X$. We explicitly determined the elements of $G$ in Example 7.48. We can
use this calculation to find all the distinct orbits of the action. They are:

$$Ga = \{a, e, z, v\} = Ge = Gz = Gv;$$
$$Gb = \{b, d, j, u, y, w, q, f\} = Gd = Gj = \cdots;$$
$$Gc = \{c, p, x, k\};$$
$$Gg = \{g, i, t, r\};$$
$$Gh = \{h, n, s, l\};$$
$$Gm = \{m\}.$$

The reader may have noticed in these examples that distinct orbits of the group action
are always pairwise disjoint subsets of $X$. We now prove that this always happens.


7.86. Theorem: Orbit Decomposition of a G-set. Let $X$ be a $G$-set. Every element
$x \in X$ belongs to exactly one orbit, namely $Gx$. In other words, the distinct orbits of the
action of $G$ on $X$ form a set partition of $X$.


Proof. Define a relation on $X$ by setting, for all $x, y \in X$, $x \sim y$ iff $y = g * x$ for some
$g \in G$. The relation $\sim$ is reflexive on $X$: given $x \in X$, we have $x = e * x$, so $x \sim x$. The
relation $\sim$ is symmetric: given $x, y \in X$ with $x \sim y$, we know $y = g * x$ for some $g \in G$.
A routine calculation shows that $x = g^{-1} * y$, so $y \sim x$. The relation $\sim$ is transitive: given
$x, y, z \in X$ with $x \sim y$ and $y \sim z$, we know $y = g * x$ and $z = h * y$ for some $g, h \in G$. So
$z = h * (g * x) = (hg) * x$, and $x \sim z$. Thus $\sim$ is an equivalence relation on $X$. Recall from the
proof of Theorem 2.51 that the equivalence classes of any equivalence relation on $X$ form a
set partition of $X$. In this situation, the equivalence classes are precisely the $G$-orbits, since
the equivalence class of $x$ is

$$\{y \in X : x \sim y\} = \{y \in X : y = g * x \text{ for some } g \in G\} = Gx. \qquad \square$$


7.87. Corollary. Every group $G$ is the disjoint union of its conjugacy classes.

Everything we have said can be adapted to give results on right actions. In particular,
if $G$ acts on $X$ on the right, then $X$ is partitioned into a disjoint union of the right orbits

$$xG = \{x * g : g \in G\} \quad\text{where } x \in X.$$


7.11 Cosets


The concept of cosets plays a central role in group theory. Cosets appear as the orbits of a 
certain group action. 


7.88. Definition: Right Cosets. Let $G$ be a group, and let $H$ be any subgroup of $G$.
Let $H$ act on $G$ by left multiplication: $h * x = hx$ for $h \in H$ and $x \in G$. The orbit of $x$
under this action, namely

$$Hx = \{hx : h \in H\},$$

is called the right coset of $H$ determined by $x$.


By the general theory of group actions, we know that G is the disjoint union of its right 
cosets. 


7.89. Example. Let $G = S_3$ and $H = \{\mathrm{id}, (1,2)\}$. The right cosets of $H$ in $G$ are

$$H\,\mathrm{id} = H(1,2) = \{\mathrm{id}, (1,2)\} = H;$$
$$H(1,3) = H(1,3,2) = \{(1,3), (1,3,2)\};$$
$$H(2,3) = H(1,2,3) = \{(2,3), (1,2,3)\}.$$

For the subgroup $K = \{\mathrm{id}, (1,2,3), (1,3,2)\}$, the distinct right cosets are
$K\,\mathrm{id} = K$ and $K(1,2) = \{(1,2), (2,3), (1,3)\}$.

Note that the subgroup itself is always a right coset, but the other right cosets are not
subgroups (they do not contain the identity of $G$).

By letting $H$ act on the right, we obtain the notion of a left coset, which will be used
frequently below.


7.90. Definition: Left Cosets. Let $G$ be a group, and let $H$ be any subgroup of $G$. Let
$H$ act on $G$ by right multiplication: $x * h = xh$ for $h \in H$ and $x \in G$. The orbit of $x$ under
this action, namely

$$xH = \{xh : h \in H\},$$

is called the left coset of $H$ determined by $x$.
By Theorem 7.86, $G$ is the disjoint union of its left cosets.


7.91. Example. Let $G = S_3$ and $H = \{\mathrm{id}, (1,2)\}$ as above. The left cosets of $H$ in $G$ are

$$\mathrm{id}\,H = (1,2)H = \{\mathrm{id}, (1,2)\} = H;$$
$$(1,3)H = (1,2,3)H = \{(1,3), (1,2,3)\};$$
$$(2,3)H = (1,3,2)H = \{(2,3), (1,3,2)\}.$$

In this example, $xH \ne Hx$ except when $x \in H$. This shows that left cosets and right cosets
do not coincide in general. On the other hand, for the subgroup $K = \{\mathrm{id}, (1,2,3), (1,3,2)\}$,
the left cosets are $K$ and $(1,2)K = \{(1,2), (1,3), (2,3)\}$. One checks that $xK = Kx$ for all
$x \in S_3$, so that left cosets and right cosets do coincide for some subgroups.
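
The coset computations in Examples 7.89 and 7.91 can be reproduced with a few lines of Python. This is a sketch under an assumed tuple encoding of $S_3$; the function names `left_coset` and `right_coset` are hypothetical.

```python
from itertools import permutations

# Assumed encoding: a permutation p of {1,2,3} is the tuple (p(1), p(2), p(3)).
G = list(permutations((1, 2, 3)))

def mul(f, g):
    """Group product f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def left_coset(x, H):
    """xH = {xh : h in H}."""
    return frozenset(mul(x, h) for h in H)

def right_coset(H, x):
    """Hx = {hx : h in H}."""
    return frozenset(mul(h, x) for h in H)

H = [(1, 2, 3), (2, 1, 3)]             # {id, (1,2)}
K = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]  # {id, (1,2,3), (1,3,2)}

t13 = (3, 2, 1)                        # the transposition (1,3)
print(sorted(left_coset(t13, H)))      # [(2, 3, 1), (3, 2, 1)]  = {(1,2,3), (1,3)}
print(sorted(right_coset(H, t13)))     # [(3, 1, 2), (3, 2, 1)]  = {(1,3,2), (1,3)}

# Left and right cosets of K agree for every x in S_3.
print(all(left_coset(x, K) == right_coset(K, x) for x in G))  # True
```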


Although $x = y$ certainly implies $xH = yH$, one must remember that the converse is
almost always false. The next result gives criteria for deciding when two cosets $xH$ and $yH$
are equal; it is used constantly in arguments involving cosets.


7.92. The Coset Equality Theorem. Let $H$ be a subgroup of $G$. For all $x, y \in G$, the
following conditions are logically equivalent:

(a) $xH = yH$                           (a') $yH = xH$
(b) $x \in yH$                          (b') $y \in xH$
(c) $\exists h \in H,\ x = yh$          (c') $\exists h' \in H,\ y = xh'$
(d) $y^{-1}x \in H$                     (d') $x^{-1}y \in H$


Proof. We first prove (a)$\Rightarrow$(b)$\Rightarrow$(c)$\Rightarrow$(d)$\Rightarrow$(a). If $xH = yH$, then $x = xe \in xH = yH$, so
$x \in yH$. If $x \in yH$, then $x = yh$ for some $h \in H$ by definition of $yH$. If $x = yh$ for some
$h \in H$, then multiplying by $y^{-1}$ on the left gives $y^{-1}x \in H$. Finally, assume that $y^{-1}x \in H$.
Then $y(y^{-1}x) = x$ lies in the orbit $yH$. We also have $x = xe \in xH$. As orbits are either
disjoint or equal, we must have $xH = yH$.

Interchanging $x$ and $y$ in the last paragraph proves the equivalence of (a'), (b'), (c'),
and (d'). Since (a) and (a') are visibly equivalent, the proof is complete. □
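
The equivalence of (a) and (d), together with its analogue for right cosets, can be verified exhaustively in a small case. The following sketch checks all pairs $x, y$ in $S_3$ against the subgroup $H = \{\mathrm{id}, (1,2)\}$; the tuple encoding and helper names are assumptions of the illustration.

```python
from itertools import permutations

# Assumed encoding: a permutation p of {1,2,3} is the tuple (p(1), p(2), p(3)).
G = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]  # {id, (1,2)}

def mul(f, g):
    """Group product f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def inv(f):
    """Inverse permutation of f."""
    out = [0] * len(f)
    for i, v in enumerate(f, start=1):
        out[v - 1] = i
    return tuple(out)

def left_coset(x):
    return frozenset(mul(x, h) for h in H)

def right_coset(x):
    return frozenset(mul(h, x) for h in H)

# (a) <=> (d): xH = yH exactly when y^{-1} x lies in H.
left_ok = all((left_coset(x) == left_coset(y)) == (mul(inv(y), x) in H)
              for x in G for y in G)
# Right-coset analogue: Hx = Hy exactly when x y^{-1} lies in H.
right_ok = all((right_coset(x) == right_coset(y)) == (mul(x, inv(y)) in H)
               for x in G for y in G)
print(left_ok, right_ok)  # True True
```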


7.93. Remark. The equivalence of (a) and (d) in the last theorem is used quite frequently.
Note too that the subgroup $H$ is a coset (namely $eH$), and $xH = H$ iff $xH = eH$ iff
$e^{-1}x \in H$ iff $x \in H$. Finally, one can prove an analogous theorem for right cosets. The key
difference is that $Hx = Hy$ iff $xy^{-1} \in H$ iff $yx^{-1} \in H$ (so that inverses occur on the right
for right cosets).


We can use cosets to construct more examples of G-sets. 


7.94. Example: The G-set G/H. Let $G$ be a group, and let $H$ be any subgroup of $G$.
Let $G/H$ be the set of all distinct left cosets of $H$ in $G$. Every element of $G/H$ is a subset
of $G$ of the form $xH = \{xh : h \in H\}$ for some $x \in G$ ($x$ is not unique when $|H| > 1$). So,
$G/H$ is a subset of the power set $\mathcal{P}(G)$. Let the group $G$ act on the set $X = G/H$ by left
multiplication:

$$g * S = \{gs : s \in S\} \quad\text{for all } g \in G \text{ and } S \in G/H.$$

This action is the restriction of the action from Example 7.66 to the domain $G \times X$. To see
that the action makes sense, we must check that $X$ is a $G$-stable subset of $\mathcal{P}(G)$. Let $xH$
be an element of $X$ and let $g \in G$; then

$$g * (xH) = \{g(xh) : h \in H\} = \{(gx)h : h \in H\} = (gx)H \in X.$$

Let $[G : H] = |G/H|$, which may be infinite; this cardinal number is called the index of $H$
in $G$. Lagrange’s Theorem (proved below) shows that $|G/H| = |G|/|H|$ when $G$ is finite.


7.95. Remark. Using the Coset Equality Theorem, it can be shown that the map sending
each left coset $xH$ to the right coset $Hx^{-1}$ gives a well-defined bijection between $G/H$ and
the set of right cosets of $H$ in $G$. So, we would obtain the same number $[G : H]$ if we had
used right cosets in the definition of $G/H$. It is more convenient to use left cosets here, so
that $G$ can act on $G/H$ on the left.


7.96. Example. If $G = S_3$ and $H = \{\mathrm{id}, (1,2)\}$, then

$$G/H = \{\{\mathrm{id}, (1,2)\}, \{(1,3), (1,2,3)\}, \{(2,3), (1,3,2)\}\} = \{\mathrm{id}\,H, (1,3)H, (2,3)H\}.$$

We have $[G : H] = |G/H| = 3$. Note that $|G|/|H| = 6/2 = 3 = |G/H|$. This is a special
case of Lagrange’s Theorem, proved below.


To prepare for Lagrange’s Theorem, we first show that every left coset of H in G has 
the same cardinality as H. 


7.97. The Coset Size Theorem. Let $H$ be a subgroup of $G$. For all $x \in G$, $|xH| = |H|$.


Proof. We have seen that the left multiplication map $L_x : G \to G$, given by $L_x(g) = xg$
for $g \in G$, is a bijection with inverse $L_{x^{-1}}$. Restricting the domain of $L_x$ to $H$ gives an
injective map $L'_x : H \to G$. The image of this map is $\{xh : h \in H\} = xH$. So, restricting
the codomain gives a bijection from $H$ to $xH$. Thus, the sets $H$ and $xH$ have the same
cardinality. □


7.98. Lagrange’s Theorem. Let $H$ be any subgroup of a finite group $G$. Then

$$[G : H] \cdot |H| = |G|.$$

So $|H|$ and $[G : H]$ are divisors of $|G|$, and $|G/H| = [G : H] = |G|/|H|$.


Proof. We know that $G$ is the disjoint union of its distinct left cosets: $G = \bigcup_{S \in G/H} S$. By
the Coset Size Theorem, $|S| = |H|$ for every $S \in G/H$. So, by the Sum Rule,

$$|G| = \sum_{S \in G/H} |S| = \sum_{S \in G/H} |H| = |G/H| \cdot |H| = [G : H] \cdot |H|. \qquad \square$$


7.99. Remark. The equality of cardinal numbers $|H| \cdot [G : H] = |G|$ holds even when $G$ is
infinite, with the same proof.


7.100. Theorem: Order of Group Elements. If $G$ is a finite group of size $n$ and $x \in G$,
then the order of $x$ is a divisor of $n$, and $x^n = e_G$.


Proof. Consider the subgroup $H = \langle x \rangle$ generated by $x$. The order $d$ of $x$ is $|H|$, which divides
$|G| = n$ by Lagrange’s Theorem. Writing $n = cd$, we see that $x^n = (x^d)^c = e^c = e$. □
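
As a concrete check of this theorem, the sketch below computes the order of every element of $S_4$ (so $n = 24$) and confirms that each order divides 24 and that $x^{24}$ is the identity. The tuple encoding and the helper names `order` and `power` are assumptions made for the illustration.

```python
from itertools import permutations

# Assumed encoding: f in S_4 is the tuple (f(1), ..., f(4)).
G = list(permutations((1, 2, 3, 4)))
e = (1, 2, 3, 4)
n = len(G)  # 24

def mul(f, g):
    """Group product f o g: apply g first, then f."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def order(x):
    """Smallest d >= 1 with x^d = e."""
    d, y = 1, x
    while y != e:
        y = mul(y, x)
        d += 1
    return d

def power(x, k):
    """x^k for k >= 0, by repeated multiplication."""
    y = e
    for _ in range(k):
        y = mul(y, x)
    return y

orders = sorted({order(x) for x in G})
print(orders)                                  # [1, 2, 3, 4]
print(all(n % order(x) == 0 for x in G))       # True
print(all(power(x, n) == e for x in G))        # True
```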
The next result gives an interpretation for cosets $xK$ in the case where $K$ is the kernel
of a group homomorphism.


7.101. Theorem: Cosets of the Kernel of a Homomorphism. Let $f : G \to L$ be a
group homomorphism with kernel $K$. For every $x \in G$,

$$xK = \{y \in G : f(y) = f(x)\} = Kx.$$

If $G$ is finite and $I$ is the image of $f$, it follows that $|G| = |K| \cdot |I|$.

Proof. Fix $x \in G$, and set $S = \{y \in G : f(y) = f(x)\}$. We will prove that $xK = S$. First
suppose $y \in xK$, so $y = xk$ for some $k \in K$. Applying $f$, we find that $f(y) = f(xk) =
f(x)f(k) = f(x)e_L = f(x)$, so $y \in S$. Next suppose $y \in S$, so $f(y) = f(x)$. Note that
$f(x^{-1}y) = f(x)^{-1}f(y) = e_L$, so $x^{-1}y \in \ker(f) = K$. So $y = x(x^{-1}y) \in xK$. The proof that
$S = Kx$ is analogous. To obtain the formula for $|G|$, note that $G$ is the disjoint union

$$G = \bigcup_{z \in I} \{y \in G : f(y) = z\}.$$

Every $z \in I$ has the form $z = f(x)$ for some $x \in G$. So, by what we have just proved, each
set appearing in the union is a coset of $K$, which has the same cardinality as $K$. So the
Sum Rule gives $|G| = \sum_{z \in I} |K| = |K| \cdot |I|$. □


7.102. Corollary: Size of $A_n$. For $n \ge 2$, $|A_n| = n!/2$.


Proof. We know that $\mathrm{sgn} : S_n \to \{1, -1\}$ is a surjective group homomorphism with kernel
$A_n$. So $n! = |S_n| = |A_n| \cdot 2$. □
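
The count $|A_n| = n!/2$ is easy to confirm by enumeration for small $n$. The sketch below computes the sign of a permutation from its number of inversions (one standard way to compute $\mathrm{sgn}$, chosen here for brevity) and counts the even permutations; the function name `sgn` and the tuple encoding are assumptions of the example.

```python
from itertools import permutations
from math import factorial

def sgn(p):
    """Sign of the permutation p = (p(1), ..., p(n)), computed from its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

for n in range(2, 7):
    # A_n is the kernel of sgn: the permutations of sign +1.
    size_An = sum(1 for p in permutations(range(1, n + 1)) if sgn(p) == 1)
    print(n, size_An, factorial(n) // 2)
```

Each printed row should show the same value in its last two columns.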


7.12 The Size of an Orbit 


In Theorem 7.86, we saw that every $G$-set $X$ breaks up into a disjoint union of orbits. This
result leads to two combinatorial questions. First, given $x \in X$, what is the size of the orbit
$Gx$? Second, how many orbits are there? We answer the first question here; the second
question will be solved in §7.15.

The key to computing the orbit size $|Gx|$ is to relate the $G$-set $Gx$ to one of the special
$G$-sets $G/H$ defined in Example 7.94. For this purpose, we need to associate a subgroup $H$
of $G$ to the given orbit $Gx$.


7.103. Definition: Stabilizers. Let $X$ be a $G$-set. For each $x \in X$, the stabilizer of $x$ in
$G$ is

$$\mathrm{Stab}(x) = \{g \in G : g * x = x\}.$$

Sometimes the notation $G_x$ is used to denote $\mathrm{Stab}(x)$.


We now show that $\mathrm{Stab}(x)$ is a subgroup of $G$ for each $x \in X$. First, since $e * x = x$,
$e \in \mathrm{Stab}(x)$. Second, given $g, h \in \mathrm{Stab}(x)$, we know $g * x = x = h * x$, so $(gh) * x =
g * (h * x) = x$, so $gh \in \mathrm{Stab}(x)$. Third, given $g \in \mathrm{Stab}(x)$, we know $g * x = x$, so $x = g^{-1} * x$,
so $g^{-1} \in \mathrm{Stab}(x)$.


7.104. Example. Let $S_n$ act on $X = \{1, 2, \ldots, n\}$ via $f * x = f(x)$ for $f \in S_n$ and $x \in X$.
The stabilizer of a point $i \in X$ consists of all permutations of $X$ for which $i$ is a fixed point.
In particular, $\mathrm{Stab}(n)$ consists of all bijections $f : X \to X$ with $f(n) = n$. Restricting the
domain and codomain to $\{1, 2, \ldots, n-1\}$ defines a group isomorphism between $\mathrm{Stab}(n)$
and $S_{n-1}$.


7.105. Example. Let a group $G$ act on itself by left multiplication. Right cancellation of $x$
shows that $gx = x$ iff $g = e$. Therefore, $\mathrm{Stab}(x) = \{e\}$ for all $x \in G$. At the other extreme,
we can let $G$ act on any set $X$ by declaring $g * x = x$ for all $g \in G$ and all $x \in X$. Relative
to this action, $\mathrm{Stab}(x) = G$ for all $x \in X$.


7.106. Example: Centralizers. Let $G$ act on itself by conjugation: $g * x = gxg^{-1}$ for all
$g, x \in G$. For a given $x \in G$, $g \in \mathrm{Stab}(x)$ iff $gxg^{-1} = x$ iff $gx = xg$ iff $g$ commutes with $x$.
This stabilizer subgroup is often denoted $C_G(x)$ and called the centralizer of $x$ in $G$. The
intersection $\bigcap_{x \in G} C_G(x)$ consists of all $g \in G$ that commute with every $x \in G$. This is a
subgroup called the center of $G$ and denoted $Z(G)$.


7.107. Example: Normalizers. Let $G$ be a group, and let $X$ be the set of all subgroups
of $G$. $G$ acts on $X$ by conjugation: $g * H = gHg^{-1} = \{ghg^{-1} : h \in H\}$. (Note that $g * H$ is a
subgroup, since it is the image of a subgroup under the inner automorphism of conjugation
by $g$; see Example 7.55.) For this action, $g \in \mathrm{Stab}(H)$ iff $gHg^{-1} = H$. This stabilizer
subgroup is denoted $N_G(H)$ and called the normalizer of $H$ in $G$. One may check that
$N_G(H)$ always contains $H$.


7.108. Example. Let $S_4$ act on 4-tuples of integers by permuting the positions. Then
$\mathrm{Stab}((5,1,5,1)) = \{\mathrm{id}, (1,3), (2,4), (1,3)(2,4)\}$; $\mathrm{Stab}((2,2,2,2)) = S_4$; $\mathrm{Stab}((1,2,3,4)) =
\{\mathrm{id}\}$; and $\mathrm{Stab}((2,5,2,2))$ is a subgroup of $S_4$ isomorphic to $\mathrm{Sym}(\{1,3,4\})$, which is in turn
isomorphic to $S_3$.


The following fundamental theorem calculates the size of an orbit of a group action. 


7.109. The Orbit Size Theorem. Assume $G$ is a group and $X$ is a $G$-set. For each
$x \in X$, there is a bijection $f : G/\mathrm{Stab}(x) \to Gx$ given by $f(g\,\mathrm{Stab}(x)) = g * x$ for all $g \in G$.
So, when $G$ is finite, the size of the orbit of $x$ is the index of the stabilizer of $x$, which is a
divisor of $|G|$:

$$|Gx| = [G : \mathrm{Stab}(x)] = |G|/|\mathrm{Stab}(x)|.$$


Proof. Fix $x \in X$, and write $H = \mathrm{Stab}(x)$ for brevity. We first check that the function
$f : G/H \to Gx$ given by $f(gH) = g * x$ is well-defined. Assume $g, k \in G$ satisfy $gH = kH$;
we must check that $g * x = k * x$. By the Coset Equality Theorem, $gH = kH$ means
$k^{-1}g \in H = \mathrm{Stab}(x)$, and hence $(k^{-1}g) * x = x$. Acting on both sides by $k$ and simplifying,
we obtain $g * x = k * x$. Second, is $f$ one-to-one? Fix $g, k \in G$ with $f(gH) = f(kH)$; we
must prove $gH = kH$. Now, $f(gH) = f(kH)$ means $g * x = k * x$. Acting on both sides by
$k^{-1}$, we find that $(k^{-1}g) * x = x$, so $k^{-1}g \in H$, so $gH = kH$. Third, is $f$ surjective? Given
$y \in Gx$, the definition of $Gx$ says that $y = g * x$ for some $g \in G$, so $y = f(gH)$. In summary,
$f$ is a well-defined bijection. □
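
The formula $|Gx| = |G|/|\mathrm{Stab}(x)|$ can be checked numerically on the 4-tuples of Example 7.108. The Python sketch below enumerates $S_4$, computes each orbit and stabilizer by brute force, and compares sizes. The tuple encoding of permutations and the helper names are assumptions of this illustration.

```python
from itertools import permutations

# Assumed encoding: f in S_4 is the tuple (f(1), ..., f(4)).
S4 = list(permutations((1, 2, 3, 4)))

def inverse(f):
    """Inverse permutation of f."""
    out = [0] * len(f)
    for i, v in enumerate(f, start=1):
        out[v - 1] = i
    return tuple(out)

def act(f, x):
    """Position action: (f * x)_i = x_{f^{-1}(i)}."""
    finv = inverse(f)
    return tuple(x[finv[i] - 1] for i in range(len(x)))

def orbit(x):
    return {act(f, x) for f in S4}

def stabilizer(x):
    return [f for f in S4 if act(f, x) == x]

for x in [(5, 1, 5, 1), (2, 2, 2, 2), (1, 2, 3, 4), (2, 5, 2, 2)]:
    # Orbit Size Theorem: |orbit| * |stabilizer| = |S_4| = 24.
    print(x, len(orbit(x)), len(stabilizer(x)), len(orbit(x)) * len(stabilizer(x)))
```

Under these assumptions the last column is 24 in every row, with orbit sizes 6, 1, 24, and 4 respectively.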


7.110. Remark. One can prove a stronger version of the theorem, analogous to the
Fundamental Homomorphism Theorem for Groups (see Exercise 7-57) by introducing the following
definition. Given $G$-sets $X$ and $Y$ with respective actions $*$ and $\bullet$, a $G$-map is a function
$p : X \to Y$ such that $p(g * x) = g \bullet p(x)$ for all $g \in G$ and all $x \in X$. A $G$-isomorphism is a
bijective $G$-map. The theorem above gives us a bijection $p = f^{-1}$ from the $G$-set $Gx$ to the
$G$-set $G/\mathrm{Stab}(x)$ such that $p(g_0 * x) = g_0\,\mathrm{Stab}(x)$. This bijection is in fact a $G$-isomorphism,
because

$$p(g * (g_0 * x)) = p((gg_0) * x) = (gg_0)\,\mathrm{Stab}(x) = g \bullet (g_0\,\mathrm{Stab}(x)) = g \bullet p(g_0 * x).$$

Since every $G$-set is a disjoint union of orbits, this result shows that the special $G$-sets of
the form $G/H$ are the building blocks from which all $G$-sets are constructed.


Applying Theorem 7.109 to some of the preceding examples gives the following corollary.


7.111. Corollary: Counting Conjugates of Group Elements and Subgroups. The
size of the conjugacy class of $x$ in a finite group $G$ is $[G : \mathrm{Stab}(x)] = [G : C_G(x)] =
|G|/|C_G(x)|$. If $H$ is a subgroup of $G$, the number of distinct conjugates of $H$ (i.e., subgroups
of the form $gHg^{-1}$) is $[G : \mathrm{Stab}(H)] = [G : N_G(H)] = |G|/|N_G(H)|$.


7.13 Conjugacy Classes in $S_n$


The conjugacy classes in the symmetric groups $S_n$ can be described explicitly. We shall
prove that the conjugacy class of $f \in S_n$ consists of all $g \in S_n$ with the same cycle type
as $f$ (see Definition 7.20). The proof employs the following result showing how to compute
conjugates of permutations written in cycle notation.


7.112. Theorem: Conjugation in $S_n$. For all $f, g \in S_n$, the permutation $gfg^{-1}$ is obtained by applying $g$ to each entry in the disjoint cycle decomposition of $f$. More specifically, if $f$ is written in cycle notation as

$$f = (i_1, i_2, i_3, \ldots)(j_1, j_2, \ldots)(k_1, k_2, \ldots)\cdots,$$

then

$$gfg^{-1} = (g(i_1), g(i_2), g(i_3), \ldots)(g(j_1), g(j_2), \ldots)(g(k_1), g(k_2), \ldots)\cdots.$$

In particular, $\mathrm{type}(gfg^{-1}) = \mathrm{type}(f)$.


Proof. First assume $f$ is a $k$-cycle, say $f = (i_1, i_2, \ldots, i_k)$. We prove that the functions $gfg^{-1}$ and $h = (g(i_1), g(i_2), \ldots, g(i_k))$ are equal by showing that both have the same effect on every $x \in \{1, 2, \ldots, n\}$. We consider various cases. First, if $x = g(i_s)$ for some $s < k$, then $gfg^{-1}(x) = gfg^{-1}(g(i_s)) = g(f(i_s)) = g(i_{s+1}) = h(x)$. Second, if $x = g(i_k)$, then $gfg^{-1}(x) = g(f(i_k)) = g(i_1) = h(x)$. Finally, if $x$ does not equal any $g(i_s)$, then $g^{-1}(x)$ does not equal any $i_s$. So $f$ fixes $g^{-1}(x)$, and $gfg^{-1}(x) = g(g^{-1}(x)) = x = h(x)$.

In the general case, write $f = C_1 \circ C_2 \circ \cdots \circ C_k$ where each $C_i$ is a cycle. Since conjugation by $g$ is a homomorphism,

$$gfg^{-1} = (gC_1g^{-1}) \circ (gC_2g^{-1}) \circ \cdots \circ (gC_kg^{-1}).$$

By the previous paragraph, we can compute $gC_ig^{-1}$ by applying $g$ to each element appearing in the cycle notation for $C_i$. Since this holds for each $i$, $gfg^{-1}$ can be computed by the same process. □
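As an informal check of Theorem 7.112, the following Python sketch compares $gfg^{-1}$, computed directly from the definition, with the permutation obtained by applying $g$ to every entry of a cycle notation for $f$. The helper names `cycles_to_map` and `conjugate`, and the sample permutation $f = (1,3,4)(2,5)$ in $S_5$, are illustrative assumptions and do not come from the text.

```python
from itertools import permutations

def cycles_to_map(cycles, n):
    """Turn a list of disjoint cycles on {1..n} into a dict x -> f(x)."""
    f = {x: x for x in range(1, n + 1)}
    for cyc in cycles:
        for pos, x in enumerate(cyc):
            f[x] = cyc[(pos + 1) % len(cyc)]
    return f

def conjugate(g, f):
    """Return g o f o g^{-1} as a dict, straight from the definition."""
    g_inv = {v: k for k, v in g.items()}
    return {x: g[f[g_inv[x]]] for x in g}

n = 5
f_cycles = [(1, 3, 4), (2, 5)]          # hypothetical sample permutation
f = cycles_to_map(f_cycles, n)

# Try every g in S_5 and compare with the relabeled cycle notation.
mismatches = 0
for image in permutations(range(1, n + 1)):
    g = dict(zip(range(1, n + 1), image))
    relabeled = [tuple(g[x] for x in cyc) for cyc in f_cycles]
    if conjugate(g, f) != cycles_to_map(relabeled, n):
        mismatches += 1
```

If the theorem is implemented faithfully, `mismatches` stays at zero for all 120 choices of $g$.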


7.113. Theorem: Conjugacy Classes of $S_n$. The conjugacy class of $f \in S_n$ consists of all $h \in S_n$ with $\mathrm{type}(h) = \mathrm{type}(f)$. The number of conjugacy classes is $p(n)$, the number of integer partitions of $n$.


Proof. Fix $f \in S_n$; let $T = \{gfg^{-1} : g \in S_n\}$ be the conjugacy class of $f$, and let $U = \{h \in S_n : \mathrm{type}(h) = \mathrm{type}(f)\}$. Using Theorem 7.112, we see that $T \subseteq U$. For the reverse inclusion, let $h \in S_n$ have the same cycle type as $f$. We give an algorithm for finding a $g \in S_n$ such that $h = gfg^{-1}$. First write any complete cycle notation for $f$ (including 1-cycles), writing longer cycles before shorter cycles. Immediately below this, write a complete cycle notation for $h$. Now erase all the parentheses and regard the resulting array as the two-line form of a permutation $g$. Theorem 7.112 now shows that $gfg^{-1} = h$. For example, suppose

$$f = (1,7,3)(2,8,9)(4,5)(6); \qquad h = (4,9,2)(6,3,5)(1,8)(7).$$

Then

$$g = \begin{pmatrix} 1 & 7 & 3 & 2 & 8 & 9 & 4 & 5 & 6 \\ 4 & 9 & 2 & 6 & 3 & 5 & 1 & 8 & 7 \end{pmatrix} = (1,4)(2,6,7,9,5,8,3).$$

By the very definition of $g$, applying $g$ to each symbol in the chosen cycle notation for $f$ produces the chosen cycle notation for $h$. The $g$ constructed here is not unique; we could obtain other permutations $g$ satisfying $gfg^{-1} = h$ by starting with different complete cycle notations for $f$ and $h$.

The last statement of the theorem follows since the possible cycle types of permutations of $n$ objects are exactly the integer partitions of $n$ (weakly decreasing sequences of positive integers that sum to $n$). □
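The algorithm in this proof can be sketched in Python as follows. The function names are hypothetical, and the sketch picks its own complete cycle notations (each cycle starting at its smallest entry), so the $g$ it returns may differ from one built by hand from other notations; any such $g$ still satisfies $gfg^{-1} = h$.

```python
def cycles_to_map(cycles, n):
    """Turn a list of disjoint cycles on {1..n} into a dict x -> f(x)."""
    f = {x: x for x in range(1, n + 1)}
    for cyc in cycles:
        for pos, x in enumerate(cyc):
            f[x] = cyc[(pos + 1) % len(cyc)]
    return f

def complete_cycles(f, n):
    """Complete cycle notation of f (1-cycles included), longer cycles first."""
    seen, cycles = set(), []
    for start in range(1, n + 1):
        if start not in seen:
            cyc, x = [], start
            while x not in seen:
                seen.add(x)
                cyc.append(x)
                x = f[x]
            cycles.append(cyc)
    return sorted(cycles, key=len, reverse=True)

def find_conjugator(f, h, n):
    """Stack the two notations, erase parentheses, read off g as a two-line form."""
    cf, ch = complete_cycles(f, n), complete_cycles(h, n)
    if [len(c) for c in cf] != [len(c) for c in ch]:
        raise ValueError("f and h have different cycle types")
    top = [x for c in cf for x in c]
    bottom = [x for c in ch for x in c]
    return dict(zip(top, bottom))

def conjugate(g, f):
    """Return g o f o g^{-1} as a dict."""
    g_inv = {v: k for k, v in g.items()}
    return {x: g[f[g_inv[x]]] for x in g}

# The permutations f and h from the example in the proof.
f = cycles_to_map([(1, 7, 3), (2, 8, 9), (4, 5), (6,)], 9)
h = cycles_to_map([(4, 9, 2), (6, 3, 5), (1, 8), (7,)], 9)
g = find_conjugator(f, h, 9)
```

Checking `conjugate(g, f) == h` confirms that the constructed $g$ conjugates $f$ to $h$.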


We now apply Corollary 7.111 to determine the sizes of the conjugacy classes of $S_n$.


7.114. Definition: $z_\mu$. Let $\mu$ be an integer partition of $n$ consisting of $a_1$ ones, $a_2$ twos, etc. Define

$$z_\mu = 1^{a_1} 2^{a_2} \cdots n^{a_n}\, a_1!\, a_2! \cdots a_n!.$$

For example, for $\mu = (3,3,2,2,2,2,1,1,1,1,1)$, we have $a_1 = 5$, $a_2 = 4$, $a_3 = 2$, and $z_\mu = 1^5 2^4 3^2 \cdot 5!\,4!\,2! = 829{,}440$.
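The definition of $z_\mu$ translates directly into a few lines of Python. This is only a sketch; the function name `z` is our own choice, and a partition is assumed to be given as a tuple of its parts.

```python
from collections import Counter
from math import factorial

def z(mu):
    """Compute z_mu = prod over part sizes i of i^(a_i) * a_i!,
    where a_i is the number of parts of mu equal to i."""
    result = 1
    for part, mult in Counter(mu).items():
        result *= part ** mult * factorial(mult)
    return result

example = z((3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 1))
```

For the partition in the example above, `example` reproduces the value 829,440.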


7.115. Theorem: Size of Conjugacy Classes of $S_n$. Given an integer partition $\mu$ of $n$, the number of permutations $f \in S_n$ with $\mathrm{type}(f) = \mu$ is $n!/z_\mu$.


Proof. Fix a particular $f \in S_n$ with $\mathrm{type}(f) = \mu$. By Corollary 7.111 and the fact that $|S_n| = n!$, it is enough to show that $|C_{S_n}(f)| = z_\mu$. We illustrate the reasoning through a specific example. Let $\mu = (3,3,2,2,2,2,1,1,1,1,1)$, and take

$$f = (1,2,3)(4,5,6)(7,8)(9,10)(11,12)(13,14)(15)(16)(17)(18)(19).$$

A permutation $g \in S_n$ lies in $C_{S_n}(f)$ iff $gfg^{-1} = f$ iff applying $g$ to each symbol in the given cycle notation for $f$ produces another cycle notation for $f$. So we need only count the number of ways of writing a complete cycle notation for $f$ such that longer cycles come before shorter cycles. Note that we have freedom to rearrange the order of all cycles of a given length, and we also have freedom to cyclically permute the entries in any given cycle of $f$. For example, we could permute the five 1-cycles of $f$ in any of $5!$ ways; we could replace $(4,5,6)$ by one of the three cyclic shifts $(4,5,6)$ or $(5,6,4)$ or $(6,4,5)$; and so on. For this particular $f$, the Product Rule gives $2!\,4!\,5!\,3^2 2^4 1^5 = z_\mu$ different possible complete cycle notations for $f$. The proof of the general case is the same: the term $a_i!$ in $z_\mu$ arises when we choose a permutation of the $a_i$ cycles of length $i$, while the term $i^{a_i}$ arises when we choose one of $i$ possible cyclic shifts for each of the $a_i$ cycles of length $i$. Multiplying these contributions gives $z_\mu$, as needed. □
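For small $n$, Theorems 7.113 and 7.115 can be checked by brute force. The sketch below enumerates $S_6$, groups the permutations by cycle type, and compares each class size with $n!/z_\mu$. Permutations are represented 0-based as tuples, and the names `cycle_type` and `z` are illustrative assumptions.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Cycle lengths of perm in weakly decreasing order; perm[i] is the image of i."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = perm[x]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def z(mu):
    """z_mu = product of i^(a_i) * a_i! over the distinct part sizes i of mu."""
    result = 1
    for part, mult in Counter(mu).items():
        result *= part ** mult * factorial(mult)
    return result

n = 6
class_sizes = Counter(cycle_type(p) for p in permutations(range(n)))
all_match = all(size == factorial(n) // z(mu) for mu, size in class_sizes.items())
```

Here `class_sizes` has one key per partition of 6, so its length is $p(6) = 11$, and `all_match` records whether every class has the predicted size.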




7.14 Applications of the Orbit Size Formula 


When a finite group $G$ acts on a finite set $X$, the Orbit Size Theorem 7.109 asserts that the size of the orbit $Gx$ is $|G|/|\mathrm{Stab}(x)|$, which is a divisor of $|G|$. We now use this fact to establish several famous theorems from algebra, number theory, and combinatorics. Recall that for $a, b, p \in \mathbb{Z}$, $a \equiv b \pmod{p}$ means that $a - b$ is an integer multiple of $p$. For fixed $p$, $\equiv$ is an equivalence relation on $\mathbb{Z}$.

7.116. Fermat's Little Theorem. For every integer $a > 0$ and every prime $p$, $a^p \equiv a \pmod{p}$.


Proof. Let $Y = \{1, 2, \ldots, a\}$, and let $X = Y^p$ be the set of all $p$-tuples $(y_1, \ldots, y_p)$ of elements of $Y$. By the Product Rule, $|X| = a^p$. We know that $S_p$ acts on $X$ by permuting positions (see Example 7.70). Let $H = \langle (1,2,\ldots,p) \rangle$, which is a cyclic subgroup of $S_p$ of size $p$. Restricting the action to $H$, we see that $H$ acts on $X$ by cyclically shifting positions. The only divisors of the prime $p$ are 1 and $p$, so all orbits of $X$ under the action of $H$ have size 1 or $p$. Now, $w = (y_1, y_2, \ldots, y_p)$ is in an orbit of size 1 iff all cyclic shifts of $w$ are equal to $w$ iff $y_1 = y_2 = \cdots = y_p$. So there are precisely $a$ orbits of size 1, corresponding to the $a$ possible choices for $y_1$ in $Y$. Let $k$ be the number of orbits of size $p$. Since $X$ is the disjoint union of the orbits, $a^p = |X| = kp + a$. So $a^p - a = kp$ is a multiple of $p$, as needed. □
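The orbit decomposition used in this proof is easy to observe experimentally. The following sketch (with hypothetical names and the arbitrary sample values $a = 3$, $p = 5$) lists the orbits of $p$-tuples under cyclic shifts and confirms that $a^p = kp + a$.

```python
from itertools import product

def shift_orbits(a, p):
    """Orbits of the cyclic-shift action on p-tuples with entries in {1..a}."""
    seen, orbits = set(), []
    for w in product(range(1, a + 1), repeat=p):
        if w not in seen:
            orbit = {w[i:] + w[:i] for i in range(p)}   # all cyclic shifts of w
            seen |= orbit
            orbits.append(orbit)
    return orbits

a, p = 3, 5                      # sample values; p must be prime
sizes = [len(orbit) for orbit in shift_orbits(a, p)]
ones = sizes.count(1)            # orbits consisting of a constant tuple
k = sizes.count(p)               # orbits of full size p
```

With these sample values every orbit has size 1 or 5, there are exactly $a = 3$ orbits of size 1, and $3^5 = 243 = 5k + 3$.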


7.117. Cauchy's Theorem. Suppose $G$ is a finite group and $p$ is a prime divisor of $|G|$. Then there exists an element $x \in G$ of order $p$.


Proof. As in the previous proof, the group $H = \langle (1,2,\ldots,p) \rangle$ acts on the set $G^p$ by cyclically permuting positions. Let $X$ consist of all $p$-tuples $(g_1, \ldots, g_p) \in G^p$ such that $g_1g_2 \cdots g_p = e$. We can build a typical element of $X$ by choosing $g_1, \ldots, g_{p-1}$ arbitrarily from $G$; then we are forced to choose $g_p = (g_1 \cdots g_{p-1})^{-1}$ to achieve the condition $g_1g_2 \cdots g_{p-1}g_p = e$. The Product Rule therefore gives $|X| = |G|^{p-1}$, which is a multiple of $p$.

We next claim that $X$ is an $H$-stable subset of $G^p$. This means that for $1 \le i \le p$, $g_1g_2 \cdots g_p = e$ implies $g_ig_{i+1} \cdots g_pg_1 \cdots g_{i-1} = e$. To prove this, multiply both sides of the equation $g_1g_2 \cdots g_p = e$ by $(g_1g_2 \cdots g_{i-1})^{-1}$ on the left and by $(g_1g_2 \cdots g_{i-1})$ on the right. We now know that $X$ is an $H$-set, so it is a union of orbits of size 1 and size $p$. Since $|X|$ is a multiple of $p$, the number of orbits of size 1 must be a multiple of $p$ as well. Now, $(e, e, \ldots, e)$ is one orbit of size 1; so there must exist at least $p - 1 > 0$ additional orbits of size 1. By definition of the action of $H$, such an orbit looks like $(x, x, \ldots, x)$ where $x \ne e$. By definition of $X$, we must have $x^p = e$. Since $p$ is prime, we have proved the existence of an element $x$ of order $p$ (in fact, the proof shows there are at least $p - 1$ such elements). □
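To see the counting argument in action, the sketch below builds the set $X$ for the sample group $G = S_3$ and $p = 3$ (both choices are ours, not the text's), with group elements stored as tuples and multiplication given by composition.

```python
from itertools import permutations, product

def compose(f, g):
    """Composition f o g of permutations stored as tuples (image of i is f[i])."""
    return tuple(f[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))     # the group S_3, of size 6
e = tuple(range(3))                  # identity element
p = 3                                # a prime divisor of |G|

def tuple_product(t):
    """Product g_1 g_2 ... g_p of the entries of t."""
    result = e
    for g in t:
        result = compose(result, g)
    return result

X = [t for t in product(G, repeat=p) if tuple_product(t) == e]
# X is stable under cyclic shifts of positions.
stable = all(t[1:] + t[:1] in set(X) for t in X)
# Orbits of size 1 are the constant tuples (x, x, ..., x) with x^p = e.
fixed = [t for t in X if len(set(t)) == 1]
order_p = [t[0] for t in fixed if t[0] != e]
```

In this instance $|X| = 6^2 = 36$, the number of constant tuples in $X$ is a multiple of 3, and `order_p` holds the two 3-cycles of $S_3$.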


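7.118. Lucas's Congruence for Binomial Coefficients. Suppose $p, k, n \in \mathbb{Z}_{\ge 0}$, $p$ is prime, and $0 \le k \le n$. Let $n$ and $k$ have base-$p$ expansions $n = \sum_{i \ge 0} n_i p^i$, $k = \sum_{i \ge 0} k_i p^i$, where $0 \le n_i, k_i < p$. Then

$$\binom{n}{k} \equiv \prod_{i \ge 0} \binom{n_i}{k_i} \pmod{p}, \qquad (7.4)$$

where $\binom{0}{0} = 1$ and $\binom{a}{b} = 0$ whenever $b > a$.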


Proof. Step 1. For all $p, m, j \in \mathbb{Z}_{\ge 0}$ with $p$ prime, we show that

$$\binom{m+p}{j} \equiv \binom{m}{j} + \binom{m}{j-p} \pmod{p}.$$

To prove this identity, let $X = \{1, 2, \ldots, m+p\}$, and let $Y$ be the set of all $j$-element subsets of $X$. By the Subset Rule, $|Y| = \binom{m+p}{j}$. Consider the subgroup $G = \langle (1,2,\ldots,p) \rangle$ of $S_{m+p}$, which is cyclic of size $p$. $G$ acts on $Y$ via $g * S = \{g(s) : s \in S\}$ for $g \in G$ and $S \in Y$.

$Y$ is a disjoint union of orbits under this action. Since every orbit has size 1 or $p$, $|Y|$ is congruent modulo $p$ to the number $M$ of orbits of size 1. We show that $M = \binom{m}{j} + \binom{m}{j-p}$. The orbits of size 1 correspond to the $j$-element subsets $S$ of $X$ such that $g * S = S$ for all $g \in G$. It is equivalent to require that $f * S = S$ for the generator $f = (1,2,\ldots,p)$ of $G$. Suppose $S$ satisfies this condition, and consider two cases. Case 1: $S \cap \{1,2,\ldots,p\} = \emptyset$. Since $f(x) = x$ for $x > p$, we have $f * S = S$ for all such subsets $S$. Since $S$ can be an arbitrary subset of the $m$-element set $\{p+1, \ldots, p+m\}$, there are $\binom{m}{j}$ subsets of this form. Case 2: $S \cap \{1,2,\ldots,p\} \ne \emptyset$. Say $i \in S$ where $1 \le i \le p$. Applying $f$ repeatedly and noting that $f * S = S$, we see that $\{1,2,\ldots,p\} \subseteq S$. The remaining $j - p$ elements of $S$ can be chosen arbitrarily from the $m$-element set $\{p+1, \ldots, p+m\}$. So there are $\binom{m}{j-p}$ subsets of this form. Combining the two cases, we see that $M = \binom{m}{j} + \binom{m}{j-p}$.
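Step 2. Assume $p$ is prime, $a, c \ge 0$, and $0 \le b, d < p$; we show that $\binom{ap+b}{cp+d} \equiv \binom{a}{c}\binom{b}{d} \pmod{p}$. The idea is to use Step 1 and Pascal's Identity $\binom{a+1}{c} = \binom{a}{c} + \binom{a}{c-1}$. We proceed by induction on $a$. The base step is $a = 0$. If $a = 0$ and $c > 0$, both sides of the congruence are zero; if $a = 0 = c$, then both sides of the congruence are $\binom{b}{d}$. Assuming that the result holds for a given integer $a$ (and all $b, c, d$), the following computation shows that it holds for $a + 1$:

$$\binom{(a+1)p+b}{cp+d} = \binom{(ap+b)+p}{cp+d} \equiv \binom{ap+b}{cp+d} + \binom{ap+b}{(c-1)p+d} \equiv \binom{a}{c}\binom{b}{d} + \binom{a}{c-1}\binom{b}{d}$$
$$= \left[\binom{a}{c} + \binom{a}{c-1}\right]\binom{b}{d} = \binom{a+1}{c}\binom{b}{d} \pmod{p}.$$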


Step 3. We prove Lucas's Congruence (7.4) by induction on $n$. If $k > n$, then $k_i > n_i$ for some $i$, so that both sides of the congruence are zero. From now on, assume $k \le n$. The result holds in the base cases $0 \le n < p$, since $n = n_0$, $k = k_0$, and all higher digits of the base-$p$ expansions of $n$ and $k$ are zero. For the induction step, note that $n = ap + n_0$, $k = cp + k_0$, where $a = \sum_{i \ge 0} n_{i+1} p^i$ and $c = \sum_{i \ge 0} k_{i+1} p^i$ in base $p$. (We obtain $a$ and $c$ from $n$ and $k$, respectively, by chopping off the final base-$p$ digits $n_0$ and $k_0$.) By Step 2 and induction, we have

$$\binom{n}{k} \equiv \binom{a}{c}\binom{n_0}{k_0} \equiv \prod_{i \ge 1} \binom{n_i}{k_i} \cdot \binom{n_0}{k_0} = \prod_{i \ge 0} \binom{n_i}{k_i} \pmod{p}. \qquad \square$$
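Lucas's Congruence also gives a practical way to reduce binomial coefficients modulo a prime. The sketch below (the function name `lucas` is our own) multiplies the digit-wise binomial coefficients and compares the result with a direct computation using Python's `math.comb`, which returns 0 when the lower index exceeds the upper one.

```python
from math import comb

def lucas(n, k, p):
    """Right side of Lucas's Congruence: product of C(n_i, k_i) mod p
    over the base-p digits n_i, k_i of n and k."""
    result = 1
    while n or k:
        n, n_digit = divmod(n, p)
        k, k_digit = divmod(k, p)
        result = result * comb(n_digit, k_digit) % p
    return result

# Compare with the left side C(n, k) mod p on a small range.
agree = all(
    comb(n, k) % p == lucas(n, k, p)
    for p in (2, 3, 5, 7)
    for n in range(60)
    for k in range(n + 1)
)
```

The flag `agree` is true when both sides match on the whole tested range.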


7.119. Corollary. Given $a, b, p \in \mathbb{Z}_{>0}$ with $p$ prime and $p$ not dividing $b$, $p$ does not divide $\binom{p^a b}{p^a}$.


Proof. Write $b = \sum_{i \ge 0} b_i p^i$ in base $p$. The base-$p$ expansions of $p^a b$ and $p^a$ are $p^a b = \cdots b_3 b_2 b_1 b_0 00 \cdots 0$ and $p^a = 100 \cdots 0$, respectively, where each expansion ends in $a$ zeroes. Since $b_0 \ne 0$ by hypothesis, Lucas's Congruence gives

$$\binom{p^a b}{p^a} \equiv \binom{b_0}{1} = b_0 \not\equiv 0 \pmod{p}. \qquad \square$$

This corollary can also be proved directly, by writing out the fraction defining $\binom{p^a b}{p^a}$ and counting powers of $p$ in the numerator and denominator (see Exercise 7-90).
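A quick numerical spot check of Corollary 7.119 is sketched below. The tested ranges of $p$, $a$, and $b$ are arbitrary choices made for illustration.

```python
from math import comb

# Collect any (p, a, b) with p not dividing b for which p divides C(p^a * b, p^a).
counterexamples = [
    (p, a, b)
    for p in (2, 3, 5)
    for a in range(1, 4)
    for b in range(1, 12)
    if b % p != 0 and comb(p ** a * b, p ** a) % p == 0
]
```

The corollary predicts that `counterexamples` is empty.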


7.120. Sylow's Theorem. Let $G$ be a finite group of size $p^a b$, where $p$ is prime, $a > 0$, and $p$ does not divide $b$. There exists a subgroup $H$ of $G$ of size $p^a$.


Proof. Let $X$ be the collection of all subsets of $G$ of size $p^a$. By the Subset Rule, $|X| = \binom{p^a b}{p^a}$. Corollary 7.119 says that $p$ does not divide $|X|$. $G$ acts on $X$ by left multiplication: $g * S = \{gs : s \in S\}$ for $g \in G$ and $S \in X$. (The set $g * S$ still has size $p^a$, since left multiplication by $g$ is injective.) Not every orbit of $X$ has size divisible by $p$, since $|X|$ itself is not divisible by $p$. Choose $T \in X$ such that $|GT| \not\equiv 0 \pmod{p}$. Let $H = \mathrm{Stab}(T) = \{g \in G : g * T = T\}$, which is a subgroup of $G$. The size of the orbit of $T$ is $|G|/|H| = p^a b/|H|$. This integer is not divisible by $p$, forcing $|H|$ to be a multiple of $p^a$. So $|H| \ge p^a$. To obtain the reverse inequality, let $t_0$ be any fixed element of $T$. Given any $h \in H$, $h * T = T$ implies $ht_0 \in T$. So the right coset $Ht_0 = \{ht_0 : h \in H\}$ is contained in $T$. We conclude that $|H| = |Ht_0| \le |T| = p^a$. Thus $H$ is a subgroup of size $p^a$ (and $T$ is in fact one of the right cosets of $H$). □
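The construction in this proof can be carried out by machine for a small group. The sketch below uses the sample group $G = S_3$ with $p = 2$, $a = 1$, $b = 3$ (our choice), searches for a subset $T$ whose orbit size is not divisible by $p$, and takes $H = \mathrm{Stab}(T)$.

```python
from itertools import combinations, permutations

def compose(f, g):
    """Composition f o g of permutations stored as tuples."""
    return tuple(f[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))          # S_3 has size 6 = 2^1 * 3
p, a = 2, 1
size = p ** a
X = [frozenset(S) for S in combinations(G, size)]   # all subsets of size p^a

def act(g, S):
    """Left multiplication g * S = {g s : s in S}."""
    return frozenset(compose(g, s) for s in S)

H = None
for T in X:
    orbit = {act(g, T) for g in G}
    if len(orbit) % p != 0:               # an orbit of size not divisible by p
        H = [g for g in G if act(g, T) == T]
        break

closed = H is not None and all(compose(x, y) in H for x in H for y in H)
```

Here `H` comes out as a two-element subgroup of $S_3$, and `T` is one of its right cosets, as the proof predicts.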




7.15 The Number of Orbits 


The following theorem, which is sometimes called Burnside’s Lemma, allows us to count 
the number of orbits in a given G-set. 


7.121. The Orbit-Counting Theorem. Let a finite group $G$ act on a finite set $X$. For each $g \in G$, let $\mathrm{Fix}(g) = \{x \in X : gx = x\}$ be the set of fixed points of $g$, and let $N$ be the number of distinct orbits. Then

$$N = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|.$$

So the number of orbits is the average number of fixed points of elements of $G$.


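Proof. Define $f : X \to \mathbb{R}$ by setting $f(x) = 1/|Gx|$ for each $x \in X$. We will compute $\sum_{x \in X} f(x)$ in two ways. Let $\{O_1, \ldots, O_N\}$ be the distinct orbits of the group action. On one hand, by grouping summands based on which orbit they are in, we get

$$\sum_{x \in X} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} \frac{1}{|O_i|} = \sum_{i=1}^{N} 1 = N.$$

On the other hand, the Orbit Size Theorem 7.109 says that $|Gx| = |G|/|\mathrm{Stab}(x)|$. Recall the notation $\chi(gx = x) = 1$ if $gx = x$, and $\chi(gx = x) = 0$ otherwise (see Definition 4.23). We compute:

$$\sum_{x \in X} f(x) = \sum_{x \in X} \frac{|\mathrm{Stab}(x)|}{|G|} = \frac{1}{|G|} \sum_{x \in X} \sum_{g \in G} \chi(gx = x)$$
$$= \frac{1}{|G|} \sum_{g \in G} \sum_{x \in X} \chi(gx = x) = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|. \qquad \square$$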


We are finally ready to solve the counting problems involving symmetry that were men- 
tioned in the introduction to this chapter. The strategy is to introduce a set of objects X 
on which a certain group of symmetries acts. Each orbit of the group action consists of a 
set of objects in X that are identified with one another when symmetries are taken into 
account. So the solution to the counting problem is the number of orbits, and this number 
may be calculated by the formula of the previous theorem. 


7.122. Example: Counting Necklaces. How many ways can we build a five-bead circular necklace if there are seven available types of gemstones (repeats allowed) and all rotations of a given necklace are considered equivalent? We can model the set of necklaces (before accounting for symmetries) by the set of words $X = \{(y_1, y_2, y_3, y_4, y_5) : 1 \le y_i \le 7\}$. Now let $G = \langle (1,2,3,4,5) \rangle$ act on $X$ by cyclically permuting positions (see Example 7.70). Every orbit of $G$ consists of a set of necklaces that are identified with one another when symmetry is taken into account. To count the orbits, let us compute $|\mathrm{Fix}(g)|$ for each $g \in G$. First, $\mathrm{id} = (1)(2)(3)(4)(5)$ fixes every object in $X$, so $|\mathrm{Fix}(\mathrm{id})| = |X| = 7^5$ by the Product Rule. Second, the generator $g = (1,2,3,4,5)$ fixes $(y_1, y_2, y_3, y_4, y_5)$ iff

$$(y_1, y_2, y_3, y_4, y_5) = (y_5, y_1, y_2, y_3, y_4).$$

Comparing coordinates, this holds iff $y_1 = y_2 = y_3 = y_4 = y_5$. So $|\mathrm{Fix}((1,2,3,4,5))| = 7$ since there are seven choices for $y_1$, and then $y_2$, $y_3$, $y_4$, and $y_5$ are determined. Next, what is $|\mathrm{Fix}(g^2)|$? We have $g^2 = (1,3,5,2,4)$, so that $g^2$ fixes $(y_1, y_2, y_3, y_4, y_5)$ iff

$$(y_1, y_2, y_3, y_4, y_5) = (y_4, y_5, y_1, y_2, y_3),$$

which holds iff $y_1 = y_3 = y_5 = y_2 = y_4$. So $|\mathrm{Fix}(g^2)| = 7$. Similarly, $|\mathrm{Fix}(g^3)| = |\mathrm{Fix}(g^4)| = 7$, so the answer is

$$\frac{7^5 + 7 + 7 + 7 + 7}{5} = 3367.$$

Now suppose we are counting six-bead necklaces, identifying all rotations of a given necklace. Here, the group of symmetries is


G = {id, (1,2,3,4,5,6), (1,3,5)(2,4,6), (1,4)(2,5)(3,6), (1,5,3)(2,6,4), (1,6,5,4,3, 2)}. 


As before, id has $7^6$ fixed points, and each of the two 6-cycles has 7 fixed points. What is $|\mathrm{Fix}((1,3,5)(2,4,6))|$? We have

$$(1,3,5)(2,4,6) * (y_1, y_2, y_3, y_4, y_5, y_6) = (y_5, y_6, y_1, y_2, y_3, y_4),$$

and this equals $(y_1, \ldots, y_6)$ iff $y_1 = y_3 = y_5$ and $y_2 = y_4 = y_6$. Here there are 7 choices for $y_1$, 7 choices for $y_2$, and the remaining $y_i$'s are then forced. So $|\mathrm{Fix}((1,3,5)(2,4,6))| = 7^2$. Likewise, $|\mathrm{Fix}((1,5,3)(2,6,4))| = 7^2$. Similarly, we find that $(y_1, \ldots, y_6)$ is fixed by $(1,4)(2,5)(3,6)$ iff $y_1 = y_4$ and $y_2 = y_5$ and $y_3 = y_6$, so that there are $7^3$ such fixed points. In each case, $|\mathrm{Fix}(f)|$ turned out to be $7^{\mathrm{cyc}(f)}$, where $\mathrm{cyc}(f)$ is the number of cycles in a complete cycle notation for $f$ (including 1-cycles). The number of necklaces is

$$\frac{7^6 + 7 + 7^2 + 7^3 + 7^2 + 7}{6} = 19{,}684.$$


Now consider the question of counting five-bead necklaces using $q$ types of beads, where rotations and reflections of a given necklace are considered equivalent. For this problem, the appropriate group of symmetries is the automorphism group of the cycle graph $C_5$ (see Example 7.46). In addition to the five powers of $(1,2,3,4,5)$, this group contains the following five permutations corresponding to reflections of the necklace:

$$(1,5)(2,4)(3), \quad (1,4)(2,3)(5), \quad (1,3)(4,5)(2), \quad (1,2)(3,5)(4), \quad (2,5)(3,4)(1).$$

It can be checked that each of the five new permutations has $q^3 = q^{\mathrm{cyc}(f)}$ fixed points. For example, a necklace $(y_1, \ldots, y_5)$ is fixed by $(1,5)(2,4)(3)$ iff $y_1 = y_5$ and $y_2 = y_4$ and $y_3$ is arbitrary. Thus, we may build such a fixed point by choosing $y_1$ and $y_2$ and $y_3$ ($q$ choices each), and then setting $y_4 = y_2$ and $y_5 = y_1$. Using the Orbit-Counting Theorem, the number of necklaces is

$$\frac{q^5 + 5q^3 + 4q}{10}.$$

The next example can be used to solve many counting problems involving symmetry. 




7.123. Example: Counting Colorings under Symmetries. Suppose $V$ is a finite set of objects, $Q$ is a finite set of $q$ colors, and $G \le \mathrm{Sym}(V)$ is a group of symmetries of the objects $V$. (For example, if $V$ is the vertex set of a graph, we could take $G$ to be the automorphism group of the graph.) $G$ acts on $V$ via $g \cdot x = g(x)$ for $g \in G$ and $x \in V$. Now let $X$ be the set of all functions $f : V \to Q$. We think of a function $f$ as a coloring of $V$ such that $x$ receives color $f(x)$ for all $x \in V$. As we saw in Example 7.69, $G$ acts on $X$ via $g * f = f \circ g^{-1}$ for $g \in G$ and $f \in X$. Informally, if $f$ assigns color $c$ to object $x$, then $g * f$ assigns color $c$ to object $g(x)$. The $G$-orbits consist of colorings that are identified when we take into account the symmetries in $G$. So the number of colorings up to symmetry is $\frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|$.


In the previous example, we observed that $|\mathrm{Fix}(g)| = q^{\mathrm{cyc}(g)}$. To see why this holds in general, let $g \in G$ have a complete cycle notation $g = C_1 C_2 \cdots C_k$, where $k = \mathrm{cyc}(g)$. Let $V_i$ be the set of elements appearing in cycle $C_i$, so $V$ is the disjoint union of the sets $V_i$. Consider $C_1$, for example. Say $C_1 = (x_1, x_2, \ldots, x_s)$, so that $V_1 = \{x_1, \ldots, x_s\}$. Suppose $f \in X$ is fixed by $g$, so $f = g * f$. Then

$$f(x_2) = (g * f)(x_2) = f(g^{-1}(x_2)) = f(x_1).$$

Similarly, $f(x_3) = f(x_2)$, and in general $f(x_{j+1}) = f(x_j)$ for all $j$ with $1 \le j < s$. It follows that $f$ is constant on $V_1$. Similarly, $f$ is constant on every $V_i$ in the sense that $f$ assigns the same color to every $x \in V_i$. This argument is reversible, so $\mathrm{Fix}(g)$ consists precisely of the colorings $f \in X$ that are constant on each $V_i$. To build such an $f$, choose a common color for all the vertices in $V_i$ (for $1 \le i \le k$). By the Product Rule, $|\mathrm{Fix}(g)| = q^k = q^{\mathrm{cyc}(g)}$ as claimed. Therefore, the answer to the counting problem is

$$\frac{1}{|G|} \sum_{g \in G} q^{\mathrm{cyc}(g)}. \qquad (7.6)$$
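The key claim $|\mathrm{Fix}(g)| = q^{\mathrm{cyc}(g)}$ can be tested by brute force on a small case. The sketch below uses the eight symmetries of a square acting on its four vertices (labeled 0 to 3) and $q = 3$ colors; the group, the value of $q$, and the function names are our own sample choices.

```python
from itertools import product

def cyc(perm):
    """Number of cycles of perm, counting 1-cycles."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return count

def fixed_colorings(g, q):
    """Count colorings f with f o g = f (equivalently, f o g^{-1} = f)."""
    n = len(g)
    return sum(
        1
        for f in product(range(q), repeat=n)
        if all(f[g[v]] == f[v] for v in range(n))
    )

n, q = 4, 3
square = [tuple((i + s) % n for i in range(n)) for s in range(n)]   # rotations
square += [tuple((s - i) % n for i in range(n)) for s in range(n)]  # reflections

claim_holds = all(fixed_colorings(g, q) == q ** cyc(g) for g in square)
colorings = sum(q ** cyc(g) for g in square) // len(square)         # formula (7.6)
```

For this sample, formula (7.6) yields 21 colorings of the vertices of a square with three colors up to symmetry.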


7.124. Example: Counting Chessboards. We now answer the question asked at the beginning of this chapter: how many ways can we color a $5 \times 5$ chessboard with seven colors, if all rotations and reflections of a given colored board are considered the same? We apply the method of the preceding example. Let $B = (V, E)$ be the graph that models the chessboard (see Figure 7.3). Let $Q = \{1, 2, \ldots, 7\}$ be the set of colors, and let $X$ be the set of colorings before accounting for symmetry. The symmetry group $G = \mathrm{Aut}(B)$ was computed in Example 7.48. By inspecting the cycle notation for the eight elements $g \in G$, the answer follows from (7.6):

$$\frac{7^{25} + 7^7 + 7^{13} + 7^7 + 4 \cdot 7^{15}}{8} = 167{,}633{,}579{,}843{,}887{,}699{,}759.$$
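This computation can be reproduced without consulting the cycle notations by hand. The sketch below builds the eight symmetries of an $n \times n$ board as permutations of the cells (cell $(r, c)$ is numbered $rn + c$, a convention we introduce here) and applies formula (7.6).

```python
def board_symmetries(n):
    """The 8 rotations and reflections of an n x n board as cell permutations."""
    moves = [
        lambda r, c: (r, c),                  # identity
        lambda r, c: (c, n - 1 - r),          # rotation by 90 degrees
        lambda r, c: (n - 1 - r, n - 1 - c),  # rotation by 180 degrees
        lambda r, c: (n - 1 - c, r),          # rotation by 270 degrees
        lambda r, c: (r, n - 1 - c),          # left-right flip
        lambda r, c: (n - 1 - r, c),          # top-bottom flip
        lambda r, c: (c, r),                  # flip across main diagonal
        lambda r, c: (n - 1 - c, n - 1 - r),  # flip across anti-diagonal
    ]
    perms = []
    for move in moves:
        images = [move(r, c) for r in range(n) for c in range(n)]
        perms.append(tuple(r2 * n + c2 for r2, c2 in images))
    return perms

def cyc(perm):
    """Number of cycles of perm, counting 1-cycles."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return count

G = board_symmetries(5)
cycle_counts = sorted(cyc(g) for g in G)
boards = sum(7 ** cyc(g) for g in G) // len(G)
```

The cycle counts come out as 25, 13, two 7s, and four 15s, matching the exponents in the displayed formula.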




7.16 Pólya's Formula


Consider the following variation of the chessboard coloring example: how many ways can we color a $5 \times 5$ chessboard so that 10 squares are red, 12 are blue, and 3 are green, if all rotations and reflections of a colored board are equivalent? We can answer questions like this with the aid of Pólya's Formula, which extends the Orbit-Counting Theorem to weighted sets.

Let a finite group $G$ act on a finite set $X$. Let $\{O_1, \ldots, O_N\}$ be the orbits of this action. Suppose each $x \in X$ has a weight $\mathrm{wt}(x)$ that is a monomial in variables $z_1, \ldots, z_q$. Also assume that the weights are $G$-invariant: $\mathrm{wt}(g * x) = \mathrm{wt}(x)$ for all $g \in G$ and all $x \in X$. This condition implies that every object in a given orbit has the same weight. So we can assign a well-defined weight to each orbit by letting $\mathrm{wt}(O_i) = \mathrm{wt}(x_i)$ for any $x_i \in O_i$. The next result lets us compute the generating function for the set of weighted orbits.


7.125. The Orbit-Counting Theorem for Weighted Sets. With the notation of the preceding paragraph,

$$\sum_{i=1}^{N} \mathrm{wt}(O_i) = \frac{1}{|G|} \sum_{g \in G} \sum_{x \in \mathrm{Fix}(g)} \mathrm{wt}(x).$$

So, the weighted sum of the orbits is the average over $G$ of the weighted fixed point sets of elements of $G$.


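Proof. We adapt the proof of the original Orbit-Counting Theorem 7.121 to include weights. Define $f : X \to \mathbb{R}[z_1, \ldots, z_q]$ by setting $f(x) = \mathrm{wt}(x)/|Gx|$ for each $x \in X$. On one hand,

$$\sum_{x \in X} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} f(x) = \sum_{i=1}^{N} \sum_{x \in O_i} \frac{\mathrm{wt}(x)}{|Gx|} = \sum_{i=1}^{N} \sum_{x \in O_i} \frac{\mathrm{wt}(O_i)}{|O_i|} = \sum_{i=1}^{N} \mathrm{wt}(O_i).$$

On the other hand, using $|Gx| = |G|/|\mathrm{Stab}(x)|$, we get

$$\sum_{x \in X} f(x) = \sum_{x \in X} \frac{|\mathrm{Stab}(x)|\,\mathrm{wt}(x)}{|G|} = \frac{1}{|G|} \sum_{x \in X} \sum_{g \in G} \chi(gx = x)\,\mathrm{wt}(x)$$
$$= \frac{1}{|G|} \sum_{g \in G} \sum_{x \in X} \chi(gx = x)\,\mathrm{wt}(x) = \frac{1}{|G|} \sum_{g \in G} \sum_{x \in \mathrm{Fix}(g)} \mathrm{wt}(x). \qquad \square$$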


We now extend the setup of Example 7.123 to count weighted colorings. We are given finite sets $V$ and $Q = \{1, \ldots, q\}$, a subgroup $G$ of $\mathrm{Sym}(V)$, and the set of colorings $X$ consisting of all functions $f : V \to Q$. $G$ acts on $X$ by permuting the domain: $g * f = f \circ g^{-1}$ for $g \in G$ and $f \in X$. We define a weight for a given coloring by setting

$$\mathrm{wt}(f) = \prod_{x \in V} z_{f(x)} \in \mathbb{R}[z_1, z_2, \ldots, z_q].$$

In other words, $\mathrm{wt}(f) = z_1^{e_1} \cdots z_q^{e_q}$, where $e_i$ is the number of objects in $V$ that $f$ colors $i$. By making the change of variables $v = g^{-1}(x)$, we see that

$$\mathrm{wt}(g * f) = \prod_{x \in V} z_{f(g^{-1}(x))} = \prod_{v \in V} z_{f(v)} = \mathrm{wt}(f) \quad \text{for all } g \in G \text{ and } f \in X.$$

So the Orbit-Counting Theorem for Weighted Sets is applicable. In the unweighted case (see Example 7.123), we found that $|\mathrm{Fix}(g)| = q^{\mathrm{cyc}(g)}$ by observing that $f \in \mathrm{Fix}(g)$ must be constant on each connected component $V_1, \ldots, V_k$ of the digraph of the permutation $g$.

To take weights into account, let us construct colorings $f \in \mathrm{Fix}(g)$ using the Product Rule for Weighted Sets. Suppose the components $V_1, \ldots, V_k$ in the digraph of $g$ have sizes $n_1 \ge n_2 \ge \cdots \ge n_k$, so that $\mathrm{type}(g) = (n_1, n_2, \ldots, n_k)$. First choose a common color for the $n_1$ vertices in $V_1$. The generating function for this choice is $z_1^{n_1} + z_2^{n_1} + \cdots + z_q^{n_1}$; the term $z_i^{n_1}$ arises by coloring all $n_1$ vertices in $V_1$ with color $i$. Second, choose a common color for the $n_2$ vertices in $V_2$. The generating function for this choice is $z_1^{n_2} + \cdots + z_q^{n_2}$. Continuing similarly, we arrive at the formula

$$\sum_{x \in \mathrm{Fix}(g)} \mathrm{wt}(x) = \prod_{i=1}^{k} \left( z_1^{n_i} + z_2^{n_i} + \cdots + z_q^{n_i} \right).$$




We can abbreviate this formula by introducing the power-sum polynomials (which are studied in more detail in Chapter 9). For each integer $k \ge 1$, set $p_k(z_1, \ldots, z_q) = z_1^k + z_2^k + \cdots + z_q^k$. For each integer partition $\mu = (\mu_1, \mu_2, \ldots, \mu_\ell)$, set $p_\mu(z_1, \ldots, z_q) = \prod_{i=1}^{\ell} p_{\mu_i}(z_1, \ldots, z_q)$. Then the weighted orbit-counting formula assumes the following form.


7.126. Pólya's Formula. With the above notation, the generating function for weighted colorings with $q$ colors relative to the symmetry group $G$ is

$$\sum_{i=1}^{N} \mathrm{wt}(O_i) = \frac{1}{|G|} \sum_{g \in G} p_{\mathrm{type}(g)}(z_1, z_2, \ldots, z_q).$$

The coefficient of $z_1^{e_1} \cdots z_q^{e_q}$ in this polynomial is the number of colorings (taking the symmetries in $G$ into account) in which color $i$ is used $e_i$ times, for $1 \le i \le q$.


7.127. Example. The generating function for five-bead necklaces using $q$ types of beads (identifying all rotations and reflections of a given necklace) is

$$(p_{(1,1,1,1,1)} + 4p_{(5)} + 5p_{(2,2,1)})/10,$$

where all power-sum polynomials involve the variables $z_1, \ldots, z_q$.


7.128. Example. Let us use Pólya's Formula to count $5 \times 5$ chessboards with 10 red squares, 12 blue squares, and 3 green squares. We may as well take $q = 3$ here. Consulting the cycle decompositions in Example 7.48, we find that the group $G$ has one element of type $(1^{25}) = (1, 1, \ldots, 1)$, two elements of type $(4^6, 1)$, one element of type $(2^{12}, 1)$, and four elements of type $(2^{10}, 1^5)$. Therefore, $\sum_{i=1}^{N} \mathrm{wt}(O_i)$ is

$$\frac{p_{(1^{25})}(z_1, z_2, z_3) + 2p_{(4^6,1)}(z_1, z_2, z_3) + p_{(2^{12},1)}(z_1, z_2, z_3) + 4p_{(2^{10},1^5)}(z_1, z_2, z_3)}{8}.$$

Using a computer algebra system, we can compute this polynomial and extract the coefficient of $z_1^{10} z_2^{12} z_3^{3}$. The final answer is 185,937,878.
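A full computer algebra system is not strictly needed for this extraction. As a rough sketch, polynomials in $z_1, z_2, z_3$ can be stored as Python dictionaries mapping exponent triples to coefficients, discarding any term whose exponents already exceed the target $(10, 12, 3)$. The representation and all names below are our own assumptions.

```python
from collections import defaultdict

TARGET = (10, 12, 3)   # exponents of z1, z2, z3 in the wanted monomial

def multiply(A, B):
    """Multiply two polynomials, dropping terms that can no longer reach TARGET."""
    C = defaultdict(int)
    for ea, ca in A.items():
        for eb, cb in B.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            if all(x <= t for x, t in zip(e, TARGET)):
                C[e] += ca * cb
    return dict(C)

def power_sum(k, q=3):
    """The power sum p_k = z1^k + ... + zq^k."""
    return {tuple(k if j == i else 0 for j in range(q)): 1 for i in range(q)}

def p_type(mu):
    """The product p_mu of the power sums p_k over the parts k of mu."""
    result = {(0, 0, 0): 1}
    for part in mu:
        result = multiply(result, power_sum(part))
    return result

# (number of group elements, cycle type), as listed in the example.
types = [
    (1, (1,) * 25),
    (2, (4,) * 6 + (1,)),
    (1, (2,) * 12 + (1,)),
    (4, (2,) * 10 + (1,) * 5),
]
total = sum(mult * p_type(mu).get(TARGET, 0) for mult, mu in types)
answer = total // 8
```

The value of `answer` agrees with the count 185,937,878 stated above; the two elements of type $(4^6, 1)$ contribute nothing to this particular coefficient.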




Summary 


Table 7.3 summarizes some definitions from group theory used in this chapter. Table 7.4 
contains definitions pertinent to the theory of group actions. 


• Examples of Groups. (i) Additive commutative groups: $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_n$ (the integers mod $n$); (ii) Multiplicative commutative groups: invertible elements in $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_n$; (iii) Non-commutative groups: invertible matrices in $M_n(\mathbb{R})$, the group $\mathrm{Sym}(X)$ of bijections on $X$ under composition, dihedral groups (automorphism groups of cycle graphs); (iv) Constructions of groups: product groups (Exercise 7-6), subgroups, quotient groups (Exercise 7-55), cyclic subgroup generated by a group element, automorphism group of a graph, automorphism group of a group.


• Basic Properties of Groups. The identity of a group is unique, as is the inverse of each group element. In a group, there are left and right cancellation laws: $(ax = ay) \Rightarrow (x = y)$ and $(xa = ya) \Rightarrow (x = y)$; inverse rules: $(x^{-1})^{-1} = x$ and $(x_1 \cdots x_n)^{-1} = x_n^{-1} \cdots x_1^{-1}$; and the Laws of Exponents: $x^{m+n} = x^m x^n$; $(x^m)^n = x^{mn}$; and, when $xy = yx$, $(xy)^n = x^n y^n$.

TABLE 7.3
Definitions in group theory.

Concept                          Definition
group axioms                     $\forall x, y, z \in G,\ x(yz) = (xy)z$ (associativity)
                                 $\exists e \in G,\ \forall x \in G,\ xe = x = ex$ (identity)
                                 $\forall x \in G,\ \exists y \in G,\ xy = e = yx$ (inverses)
commutative group                group $G$ with $xy = yx$ for all $x, y \in G$
$H$ is a subgroup of $G$         $e_G \in H$ (closure under identity)
                                 $\forall a, b \in H,\ ab \in H$ (closure under operation)
                                 $\forall a \in H,\ a^{-1} \in H$ (closure under inverses)
$H$ is normal in $G$ ($H \trianglelefteq G$)   $\forall g \in G,\ \forall h \in H,\ ghg^{-1} \in H$ (closure under conjugation)
exponent notation                $x^0 = e_G$, $x^{n+1} = x^n x$, $x^{-n} = (x^{-1})^n$ ($n \ge 0$)
multiple notation                $0x = 0_G$, $(n+1)x = nx + x$, $(-n)x = n(-x)$ ($n \ge 0$)
$k$-cycle                        permutation of the form $(i_1, i_2, \cdots, i_k)$ (cycle notation)
transposition                    2-cycle $(i, j)$
basic transposition              2-cycle $(i, i+1)$ in $S_n$
$\mathrm{cyc}(f)$                number of components in digraph of $f \in \mathrm{Sym}(X)$
$\mathrm{type}(f)$               list of cycle lengths of $f \in \mathrm{Sym}(X)$ in decreasing order
$\mathrm{inv}(w_1 \cdots w_n)$   number of $i < j$ with $w_i > w_j$
$\mathrm{sgn}(w)$ for $w \in S_n$   $(-1)^{\mathrm{inv}(w)}$
cyclic subgroup $\langle x \rangle$   $\{x^n : n \in \mathbb{Z}\}$ or (in additive notation) $\{nx : n \in \mathbb{Z}\}$
cyclic group                     group $G$ such that $G = \langle x \rangle$ for some $x \in G$
order of $x \in G$               least $n > 0$ with $x^n = e_G$, or $\infty$ if no such $n$
graph automorphism of $K$        bijection on vertex set of $K$ preserving edges of $K$
group homomorphism               map $f : G \to H$ with $f(xy) = f(x)f(y)$ for all $x, y \in G$
kernel of hom. $f : G \to H$     $\ker(f) = \{x \in G : f(x) = e_H\}$
image of hom. $f : G \to H$      $\mathrm{img}(f) = \{y \in H : y = f(x) \text{ for some } x \in G\}$
group isomorphism                bijective group homomorphism
group automorphism               group isomorphism from $G$ to itself
inner automorphism $C_g$         automorphism sending $x \in G$ to $gxg^{-1}$


• Notation for Permutations. A bijection $f \in S_n$ can be described in two-line form $\begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix}$, in one-line form $[f(1), f(2), \ldots, f(n)]$, or in cycle notation. The cycle notation is obtained by listing the elements going around each directed cycle in the digraph of $f$, enclosing each cycle in parentheses, and optionally omitting cycles of length 1. The cycle notation for $f$ is not unique.


• Sorting, Inversions, and Sign. A permutation $w = w_1 w_2 \cdots w_n \in S_n$ can be sorted to the identity permutation $\mathrm{id} = 12 \cdots n$ by applying $\mathrm{inv}(w)$ basic transpositions to switch adjacent elements that are out of order. It follows that $w$ can be written as the composition of $\mathrm{inv}(w)$ basic transpositions. Any factorization of $w$ into a product of transpositions must involve an even number of terms when $\mathrm{sgn}(w) = +1$, or an odd number when $\mathrm{sgn}(w) = -1$. Sign is a group homomorphism: $\mathrm{sgn}(f \circ g) = \mathrm{sgn}(f) \cdot \mathrm{sgn}(g)$ for $f, g \in S_n$. The sign of a $k$-cycle is $(-1)^{k-1}$. For all $f \in S_n$, $\mathrm{sgn}(f) = (-1)^{n - \mathrm{cyc}(f)}$.


TABLE 7.4
Definitions involving group actions.

Concept                                Definition
action axioms for $G$-set $X$          $\forall g \in G,\ \forall x \in X,\ g * x \in X$ (closure)
                                       $\forall x \in X,\ e_G * x = x$ (identity)
                                       $\forall g, h \in G,\ \forall x \in X,\ g * (h * x) = (gh) * x$ (assoc.)
perm. representation of $G$ on $X$     group homomorphism $R : G \to \mathrm{Sym}(X)$
$G$-stable subset $Y$ of $X$           $\forall g \in G,\ \forall y \in Y,\ g * y \in Y$ (closure under action)
orbit of $x$ in $G$-set $X$            $Gx = G * x = \{g * x : g \in G\}$
stabilizer of $x$ rel. to $G$-set $X$  $\mathrm{Stab}(x) = \{g \in G : g * x = x\} \le G$
fixed points of $g$ in $G$-set $X$     $\mathrm{Fix}(g) = \{x \in X : g * x = x\}$
conjugacy class of $x$ in $G$          $\{gxg^{-1} : g \in G\}$
centralizer of $x$ in $G$              $C_G(x) = \{g \in G : gx = xg\} \le G$
center of $G$                          $Z(G) = \{g \in G : gx = xg \text{ for all } x \in G\} \trianglelefteq G$
normalizer of $H$ in $G$               $N_G(H) = \{g \in G : gHg^{-1} = H\} \le G$
left coset of $H$                      $xH = \{xh : h \in H\}$
right coset of $H$                     $Hx = \{hx : h \in H\}$
set of left cosets $G/H$               for $H \le G$, $G/H = \{xH : x \in G\}$
index $[G : H]$                        $[G : H] = |G/H| =$ number of left cosets of $H$ in $G$


• Properties of Cyclic Groups. Every cyclic group is commutative and isomorphic to $\mathbb{Z}$ or $\mathbb{Z}_n$ (under addition) for some $n \ge 1$. More precisely, if $G = \langle x \rangle$ is an infinite multiplicative cyclic group, then $f : \mathbb{Z} \to G$ given by $f(i) = x^i$ for $i \in \mathbb{Z}$ is a group isomorphism. If $G = \langle x \rangle$ has size $n$, then $g : \mathbb{Z}_n \to G$ given by $g(i) = x^i$ for $i \in \mathbb{Z}_n$ is a group isomorphism; moreover, $x^m = e$ iff $n$ divides $m$. Every subgroup of the additive group $\mathbb{Z}$ has the form $k\mathbb{Z}$ for a unique $k \ge 0$. Every subgroup of a cyclic group is cyclic.


• Properties of Group Homomorphisms. If $f : G \to H$ is a group homomorphism, then $\ker(f) \trianglelefteq G$ and $\mathrm{img}(f) \le H$. Moreover, $f(x^n) = f(x)^n$ for all $x \in G$ and $n \in \mathbb{Z}$. The composition of group homomorphisms (respectively isomorphisms) is a group homomorphism (respectively isomorphism), and the inverse of a group isomorphism is a group isomorphism.


• Main Results on Group Actions. Actions $*$ of a group $G$ on a set $X$ correspond bijectively to permutation representations $R : G \to \mathrm{Sym}(X)$, via the formula $R(g)(x) = g * x$ for $g \in G$ and $x \in X$. Every $G$-set $X$ is the disjoint union of orbits; more precisely, each $x \in X$ lies in a unique orbit $Gx$. The size of the orbit $Gx$ is the index (number of cosets) of the stabilizer $\mathrm{Stab}(x)$ in $G$, which (for finite $G$) is a divisor of $|G|$. The number of orbits is the average number of fixed points of elements of $G$ (for $G$ finite); this extends to weighted sets where the weight is constant on each orbit.


• Examples of Group Actions. A subgroup $H$ of a group $G$ can act on $G$ by left multiplication ($h * x = hx$), by inverted right multiplication ($h * x = xh^{-1}$), and by conjugation ($h * x = hxh^{-1}$). The orbits of $x$ under these respective actions are the right coset $Hx$, the left coset $xH$, and (when $H = G$) the conjugacy class of $x$ in $G$. Similarly, $G$ and its subgroups act on the set of all subsets of $G$ by left multiplication, and $G$ acts by conjugation on the set of subgroups of $G$. The set of subsets of a fixed size $k$ is also a $G$-set under these actions. Centralizers of elements and normalizers of subgroups are stabilizers of group actions, hence they are subgroups of $G$. Any subgroup $G$ of $\mathrm{Sym}(X)$ acts on $X$ by $g * x = g(x)$ for $g \in G$ and $x \in X$. For any set $X$, $S_n$ (or its subgroups) acts on $X^n$ via $f \cdot (x_1, \ldots, x_n) = (x_{f^{-1}(1)}, \ldots, x_{f^{-1}(n)})$. For any subgroup $H$ of $G$, $G$ acts on $G/H$ via $g * (xH) = (gx)H$ for $g, x \in G$.

• Facts about Cosets. Given a subgroup $H$ of a group $G$, $G$ is the disjoint union of its left cosets, which all have the same cardinality as $H$ (similarly for right cosets). This implies Lagrange's Theorem: $|G| = |H| \cdot [G : H]$, so that (for finite $G$) the order and index of any subgroup of $G$ are both divisors of $|G|$. To test equality of left cosets, one may check any of the following equivalent conditions: $xH = yH$; $x \in yH$; $x = yh$ for some $h \in H$; $y^{-1}x \in H$; $x^{-1}y \in H$. Similarly, $Hx = Hy$ iff $xy^{-1} \in H$ iff $yx^{-1} \in H$. Left and right cosets coincide (i.e., $xH = Hx$ for all $x \in G$) iff $H$ is normal in $G$ iff all conjugates $xHx^{-1}$ equal $H$ iff $H$ is a union of conjugacy classes of $G$. Given a group homomorphism $f : G \to L$ with kernel $K$, $Kx = xK = \{y \in G : f(y) = f(x)\}$ for all $x \in G$.


• Conjugacy Classes. Every group $G$ is the disjoint union of its conjugacy classes, where
the conjugacy class of $x$ is $\{gxg^{-1} : g \in G\}$. Conjugacy classes need not all have the same
size. The size of the conjugacy class of $x$ is the index $[G : C_G(x)]$, where $C_G(x)$ is the
subgroup $\{y \in G : xy = yx\}$; this index is a divisor of $|G|$ for $G$ finite. For $x \in G$, the
conjugacy class of $x$ has size 1 iff $x$ is in the center $Z(G)$. So (Exercise 7-92) groups $G$ of
size $p^n$ (where $p$ is prime and $n \ge 1$) have $|Z(G)| > 1$. Each conjugacy class of $S_n$ consists
of those $f \in S_n$ with a fixed cycle type $\mu$. This follows from the fact that a cycle notation
for $gfg^{-1}$ can be found from a cycle notation for $f$ by replacing each value $x$ by $g(x)$.
The size of the conjugacy class indexed by $\mu$ is $n!/z_\mu$.


• Cayley’s Theorem on Permutation Representations. Every group $G$ is isomorphic to a
subgroup of $\mathrm{Sym}(G)$, via the homomorphism sending $g \in G$ to the left multiplication map
$L_g : G \to G$ given by $L_g(x) = gx$ for $x \in G$. Every $n$-element group is isomorphic to a
subgroup of $S_n$.


• Theorems Provable by Group Actions. (i) Fermat’s Little Theorem: $a^p \equiv a \pmod{p}$ for
$a \in \mathbb{Z}_{>0}$ and $p$ prime. (ii) Cauchy’s Theorem: If $G$ is a group and $p$ is a prime divisor
of $|G|$, then there exists $x \in G$ of order $p$. (iii) Lucas’s Congruence: For $0 \le k \le n$ and
prime $p$, $\binom{n}{k} \equiv \prod_{i \ge 0} \binom{n_i}{k_i} \pmod{p}$, where the $n_i$ and $k_i$ are the base-$p$ digits of $n$ and $k$.
(iv) Sylow’s Theorem: If $G$ is a group and $|G|$ has prime factorization $|G| = p_1^{n_1} \cdots p_k^{n_k}$,
then $G$ has a subgroup of size $p_i^{n_i}$ for $1 \le i \le k$.


• Counting Colorings under Symmetries. Given a finite set $V$, a group of symmetries
$G \le \mathrm{Sym}(V)$, and a set $Q$ of $q$ colors, the number of colorings $f : V \to Q$ taking symmetries into
account is $|G|^{-1} \sum_{g \in G} q^{\mathrm{cyc}(g)}$. If the colors are weighted using $z_1, \ldots, z_q$, the generating
function for weighted colorings is given by Pólya’s Formula

\[ \frac{1}{|G|} \sum_{g \in G} p_{\mathrm{type}(g)}(z_1, \ldots, z_q), \]

where $p_\mu = \prod_i \bigl( \sum_{j=1}^{q} z_j^{\mu_i} \bigr)$ is a power-sum symmetric polynomial. The coefficient of
$z_1^{e_1} \cdots z_q^{e_q}$ gives the number of colorings (taking the symmetries in $G$ into account) where
color $i$ is used $e_i$ times.
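
The orbit-counting formula $|G|^{-1} \sum_{g \in G} q^{\mathrm{cyc}(g)}$ can be checked numerically in a small case. The following Python sketch is an illustration only and is not taken from the text: it assumes that $V$ is the set of $n$ positions on a necklace and that $G$ is the cyclic group of $n$ rotations, and the function names are hypothetical. It compares the formula with a brute-force count of orbits.

```python
from itertools import product
from math import gcd


def necklaces_by_formula(n: int, q: int) -> int:
    """Average of q**cyc(g) over the n rotations g of an n-cycle.

    Rotation by r positions splits the n positions into gcd(n, r) cycles.
    """
    return sum(q ** gcd(n, r) for r in range(n)) // n


def necklaces_by_brute_force(n: int, q: int) -> int:
    """Count rotation orbits directly by choosing a canonical representative."""
    seen = set()
    for coloring in product(range(q), repeat=n):
        # The lexicographically smallest rotation labels the orbit.
        rotations = [coloring[r:] + coloring[:r] for r in range(n)]
        seen.add(min(rotations))
    return len(seen)


if __name__ == "__main__":
    for n, q in [(4, 2), (6, 3)]:
        print(n, q, necklaces_by_formula(n, q), necklaces_by_brute_force(n, q))
```

The brute-force count grows like $q^n$, so it is only practical for small parameters; the formula needs just $n$ terms.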


Exercises 


7-1. Let $X$ be a set with more than one element. Define $a \star b = b$ for all $a, b \in X$. (a) Prove
that $X$ satisfies the closure axiom and associativity axiom in Definition 7.1. (b) Does there
exist $e \in X$ such that $e \star x = x$ for all $x \in X$? If so, is this $e$ unique? (c) Does there exist
$e \in X$ such that $x \star e = x$ for all $x \in X$? If so, is this $e$ unique? (d) Is $X$ a group?

7-2. Let $G$ be the set of odd integers. For all $x, y \in G$, define $x \star y = x + y + 5$. Prove that
$G$ is a commutative group.


7-3. Let $G$ be the set of real numbers unequal to 1. For each $a, b \in G$, define $a \star b = a + b - ab$.
Prove that $G$ is a commutative group.

7-4. Assume $G$ is a group such that $x \star x = e$ for all $x \in G$, where $e$ is the identity element
of $G$. Prove that $G$ is commutative.

7-5. Let $G$ be a group with operation $\star$. Define a new operation $\bullet : G \times G \to G$ by setting
$a \bullet b = b \star a$ for all $a, b \in G$. Prove that $G$ is a group using the operation $\bullet$.


7-6. Product Groups. Let $G$ and $H$ be groups with operations $\star$ and $\bullet$, respectively.
(a) Show that $G \times H$ is a group using the operation $(g_1, h_1) * (g_2, h_2) = (g_1 \star g_2, h_1 \bullet h_2)$ for
$g_1, g_2 \in G$, $h_1, h_2 \in H$. (b) Show $G \times H$ is commutative iff $G$ and $H$ are commutative.

7-7. Prove that the operation $\oplus$ on $\mathbb{Z}_n$ satisfies the associative axiom by verifying the
relations (7.1). [Hint: One approach is to use the fact that for all $u, v \in \mathbb{Z}_n$, there exists
$k \in \mathbb{Z}$ with $u \oplus v = u + v - kn$.]

7-8. Suppose $G$ is a set, $\star : G \times G \to G$ is associative, and there exists $e \in G$ such that for
all $x \in G$, $e \star x = x$ and there is $y \in G$ with $y \star x = e$. Prove $G$ is a group.


7-9. For $x, y$ in a group $G$, define the commutator $[x, y] = xyx^{-1}y^{-1}$, and let $C_x(y) = xyx^{-1}$.
Verify that the following identities hold for all $x, y, z \in G$.

(a) $[x, y]^{-1} = [y, x]$

(b) $[x, yz] = [x, y]\, C_y([x, z])$

(c) $[x, yz][y, zx][z, xy] = e_G$

(d) $[[x, y], C_y(z)]\,[[y, z], C_z(x)]\,[[z, x], C_x(y)] = e_G$

7-10. Give complete proofs of the three Laws of Exponents in Theorem 7.10. 

7-11. Let $G$ be a group. For each $g \in G$, define a function $R_g : G \to G$ by setting
$R_g(x) = xg$ for each $x \in G$. $R_g$ is called right multiplication by $g$. (a) Prove that $R_g$ is a
bijection. (b) Prove that $R_e = \mathrm{id}_G$ (where $e$ is the identity of $G$) and $R_g \circ R_h = R_{hg}$ for
all $g, h \in G$. (c) Point out why $R_g$ is an element of $\mathrm{Sym}(G)$. Give two answers, one based
on (a) and one based on (b). (d) Define $\phi : G \to \mathrm{Sym}(G)$ by setting $\phi(g) = R_g$ for $g \in G$.
Prove that $\phi$ is one-to-one. (e) Prove that for all $g, h \in G$, $L_g \circ R_h = R_h \circ L_g$ (where $L_g$ is
left multiplication by $g$).

7-12. Let $G$ be a group. (a) Prove that for all $a, b \in G$, there exists a unique $x \in G$ with
$ax = b$. (b) Prove the Sudoku Theorem: in the multiplication table for a group $G$, every
group element appears exactly once in each row and column.


7-13. A certain group G has a multiplication table that is partially given here: 


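[Partial multiplication table with row and column labels 1, 2, 3, 4; the given entries are not recoverable from the source.]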


Use properties of groups to fill in the rest of the table. 

7-14. Suppose G = {u,v,w,x,y} is a group such that eg = w, yxy = x, andaxvu=y. 
Use this information to find (with explanation) the complete multiplication table for G. 
7-15. Suppose $G = \{e, f, g, h, p, r\}$ is a group such that $e \star p = p$, $f = f^{-1}$, $g = g^{-1}$,
$p = r^{-1}$, $h \star f = p$, and $g \star p = h$. Find (with explanation) the complete multiplication table
for $G$.


7-16. Let $f, g \in S_8$ be given in one-line form by $f = [3,2,7,5,1,4,8,6]$ and
$g = [4,5,1,3,2,6,8,7]$. (a) Write $f$ and $g$ in cycle notation. (b) Compute $f \circ g$, $g \circ f$, $g \circ g$, and
$f^{-1}$, giving final answers in one-line form.

7-17. Let $f, g \in S_8$ be given in cycle notation by $f = (3,2,7,5)(4,8,6)$ and
$g = (4,5,1,3,2,6,8,7)$. (a) Write $f$ and $g$ in one-line form. (b) Compute $f \circ g$, $g \circ f$, $g \circ g$, and
$f^{-1}$, giving final answers in cycle notation.

7-18. Find all cases where $[f_1\ f_2\ \cdots\ f_n] \in S_n$ (given in one-line form) is equal to
$(f_1, f_2, \ldots, f_n)$ (given in cycle notation).

7-19. Let h = [4,1,3,6,5,2] in one-line form. Compute inv(h) and sgn(h). Write h as a 
product of inv(h) basic transpositions. 

7-20. Let $f = (1,3,6)(2,8)(4)(5,7)$ and $g = (5,4,3,2,1)(7,8)$. (a) Compute $fg$, $gf$, $fgf^{-1}$,
and $gfg^{-1}$, giving all answers in cycle notation. (b) Compute $\mathrm{sgn}(f)$ and $\mathrm{sgn}(g)$ without
counting inversions. (c) Find an $h \in S_8$ such that $hfh^{-1} = (1,2,3)(4,5)(6)(7,8)$; give the
answer in two-line form.

7-21. Suppose that $f \in S_n$ has cycle type $\mu = (\mu_1, \ldots, \mu_k)$. What is the order of $f$?

7-22. The support of a bijection $f \in \mathrm{Sym}(X)$ is the set $\mathrm{supp}(f) = \{x \in X : f(x) \ne x\}$.
Two permutations $f, g \in \mathrm{Sym}(X)$ are called disjoint iff $\mathrm{supp}(f) \cap \mathrm{supp}(g) = \emptyset$. (a) Prove
that for all $x \in X$ and $f \in \mathrm{Sym}(X)$, $x \in \mathrm{supp}(f)$ implies $f(x) \in \mathrm{supp}(f)$. (b) Prove that
disjoint permutations commute, i.e., for all disjoint $f, g \in \mathrm{Sym}(X)$, $f \circ g = g \circ f$. (c) Suppose
$f \in \mathrm{Sym}(X)$ is given in cycle notation by $f = C_1 C_2 \cdots C_k$, where the $C_i$ are cycles involving
pairwise disjoint subsets of $X$. Show that the $C_i$’s commute with one another, and prove
carefully that $f = C_1 \circ C_2 \circ \cdots \circ C_k$ (see Remark 7.19).

7-23. Prove Lemma 7.27. 

7-24. (a) Verify the formula $(i_1, i_2, \ldots, i_k) = (i_1, i_2) \circ (i_2, i_3) \circ (i_3, i_4) \circ \cdots \circ (i_{k-1}, i_k)$ used in
the proof of Theorem 7.33. (b) Prove that every transposition has sign $-1$ by finding an
explicit formula for $(i, j)$ as a product of an odd number of basic transpositions (which have
sign $-1$ by Lemma 7.26 with $w = \mathrm{id}$).

7-25. Given $f \in S_n$, how are the one-line forms of $f$ and $f \circ (i, j)$ related? How are the
one-line forms of $f$ and $(i, j) \circ f$ related?

7-26. Given $f \in S_n$, how are the one-line forms of $f$ and $f \circ (1, 2, \ldots, n)$ related? How are
the one-line forms of $f$ and $(1, 2, \ldots, n) \circ f$ related?

7-27. Let $f \in S_n$ and $h = (i, i+1) \circ f$. (a) Prove an analogue of Lemma 7.26 relating
$\mathrm{inv}(f)$ to $\mathrm{inv}(h)$ and $\mathrm{sgn}(f)$ to $\mathrm{sgn}(h)$. (b) Use (a) to give another proof of the formula
$\mathrm{sgn}(f \circ g) = \mathrm{sgn}(f)\,\mathrm{sgn}(g)$ that proceeds by induction on $\mathrm{inv}(f)$.

7-28. Prove that for all $n \ge 3$, every $f$ in the alternating group $A_n$ (see Example 7.59) can
be written as a product of 3-cycles.

7-29. Prove that for all $n \ge 2$, every $f \in S_n$ can be written as a composition of factors,
where each factor is $(1, 2)$ or $(1, 2, \ldots, n)$. Give an algorithm, based on the one-line form of
$f$, for finding such a factorization. Illustrate this algorithm for $f = 36241875$.

7-30. For which choices of $n$ and $k$ can every $f \in S_n$ be written as a product of factors,
where each factor is $(1, k)$ or $(1, 2, \ldots, n)$?

7-31. Verify all the assertions in Example 7.38.

7-32. Let $x$ be an element of a group $G$, written multiplicatively. Use the Laws of Exponents
to verify that $\langle x \rangle = \{x^n : n \in \mathbb{Z}\}$ is a subgroup of $G$.

7-33. The Subgroup Generated by a Set. Let $S$ be a nonempty subset of a group
$G$. Let $\langle S \rangle$ be the set of elements of $G$ of the form $x_1 x_2 \cdots x_n$, where $n \in \mathbb{Z}_{>0}$ and, for
$1 \le i \le n$, either $x_i \in S$ or $x_i^{-1} \in S$. Prove that $\langle S \rangle \le G$, and for all $T$ with $S \subseteq T \le G$,
$\langle S \rangle \le T$.

7-34. Prove that every subgroup of a cyclic group is cyclic.

7-35. For subsets $S$ and $T$ of a multiplicative group $G$, define $ST = \{st : s \in S, t \in T\}$.
(a) Show that if $S \trianglelefteq G$ and $T \le G$, then $ST = TS$ and $ST \le G$. Give an example to show
$ST$ may not be normal in $G$. (b) Show that if $S \trianglelefteq G$ and $T \trianglelefteq G$, then $ST \trianglelefteq G$. (c) Give an
example of a group $G$ and subgroups $S$ and $T$ such that $ST$ is not a subgroup of $G$.

7-36. Let $S$ and $T$ be finite subgroups of a group $G$. Prove that $|S| \cdot |T| = |ST| \cdot |S \cap T|$.

7-37. Assume that $G$ is a group and $H \le G$. Let $H^{-1} = \{h^{-1} : h \in H\}$. (a) Show that
$HH = H^{-1} = HH^{-1} = H$. (b) Prove that $H \trianglelefteq G$ iff $gHg^{-1} = H$ for all $g \in G$.

7-38. Show that a subgroup $H$ of a group $G$ is normal in $G$ iff $H$ is a union of conjugacy
classes of $G$.


7-39. Find all the subgroups of S4. Which subgroups are normal? Confirm that Sylow’s 
Theorem 7.120 is true for this group. 


7-40. Find all normal subgroups of $S_5$, and prove that you have found them all (Lagrange’s
Theorem and Exercise 7-38 can be helpful here).

7-41. Suppose $H$ is a finite, nonempty subset of a group $G$ such that $xy \in H$ for all $x, y \in H$.
Prove that $H \le G$. Give an example to show this result may not be true if $H$ is not finite.

7-42. Given any simple graph or digraph $K$ with vertex set $X$, show that $\mathrm{Aut}(K)$ is a
subgroup of $\mathrm{Sym}(X)$.

7-43. Determine the automorphism groups of the following graphs and digraphs: (a) the
path graph $P_n$ with vertex set $\{1, 2, \ldots, n\}$ and edges $\{1,2\}, \{2,3\}, \ldots, \{n-1, n\}$; (b) the
complete graph $K_n$ with vertex set $\{1, 2, \ldots, n\}$ and an edge between every pair of distinct
vertices; (c) the empty graph on $\{1, 2, \ldots, n\}$ with no edges; (d) the directed cycle with
vertex set $\{1, 2, \ldots, n\}$ and edges $(1,2), (2,3), \ldots, (n-1, n), (n, 1)$; (e) the graph with vertex
set $\{\pm 1, \pm 2, \ldots, \pm n\}$ and edge set $\{\{i, -i\} : 1 \le i \le n\}$.

7-44. Let $K$ be the Petersen graph defined in Exercise 3-101. (a) Given two paths
$P = (y_0, y_1, y_2, y_3)$ and $Q = (z_0, z_1, z_2, z_3)$ in $K$, prove that there exists a unique automorphism
of $K$ that maps $y_i$ to $z_i$ for $0 \le i \le 3$. (b) Prove that $K$ has exactly $5! = 120$ automorphisms.
(c) Is $\mathrm{Aut}(K)$ isomorphic to $S_5$?

7-45. Let $Q_k$ be the simple graph with vertex set $V = \{0,1\}^k$ and edge set
$E = \{(v, w) \in V^2 : v, w \text{ differ in exactly one position}\}$. $Q_k$ is called a $k$-dimensional hypercube. (a) Compute
$|V(Q_k)|$, $|E(Q_k)|$, and $\deg(Q_k)$. (b) Show that $Q_k$ has exactly $\binom{k}{i} 2^{k-i}$ induced subgraphs
isomorphic to $Q_i$. (c) Find all the automorphisms of $Q_k$. How many are there?


7-46. (a) Construct an undirected graph whose automorphism group has size three. What
is the minimum number of vertices in such a graph? (b) For each $n \ge 1$, construct an
undirected graph whose automorphism group is cyclic of size $n$.

7-47. Let $G$ be a simple graph with connected components $C_1, \ldots, C_k$. Assume that $C_i$ is
not isomorphic to $C_j$ for all $i \ne j$. Show that $\mathrm{Aut}(G)$ is isomorphic to the product group
$\mathrm{Aut}(C_1) \times \cdots \times \mathrm{Aut}(C_k)$.

7-48. Let $f : G \to H$ be a group homomorphism. (a) Show that if $K \le G$, then
$f[K] = \{f(x) : x \in K\}$ is a subgroup of $H$. If $K \trianglelefteq G$, must $f[K]$ be normal in $H$? (b) Show
that if $L \le H$, then $f^{-1}[L] = \{x \in G : f(x) \in L\}$ is a subgroup of $G$. If $L \trianglelefteq H$, must
$f^{-1}[L]$ be normal in $G$? (c) Deduce from (a) and (b) that the kernel and image of a group
homomorphism are subgroups.

7-49. Show that the group of nonzero complex numbers under multiplication is isomorphic
to the product of the subgroups $\mathbb{R}_{>0}$ and $\{z \in \mathbb{C} : |z| = 1\}$.

7-50. Give examples of four non-isomorphic groups of size 12. 


7-51. Suppose $G$ is a commutative group with subgroups $H$ and $K$, such that $G = HK$
and $H \cap K = \{e_G\}$. (a) Prove that the map $f : H \times K \to G$, given by $f(h, k) = hk$ for
$h \in H$ and $k \in K$, is a group isomorphism. (b) Does any analogous result hold if $G$ is not
commutative? What if $H$ and $K$ are normal in $G$?

7-52. (a) Let $G$ be a group and $x \in G$. Show there exists a unique group homomorphism
$f : \mathbb{Z} \to G$ with $f(1) = x$. (b) Use (a) to determine the group $\mathrm{Aut}(\mathbb{Z})$.

7-53. (a) Suppose $G$ is a group, $x \in G$, and $x^n = e_G$ for some $n \ge 2$. Show there exists a
unique group homomorphism $f : \mathbb{Z}_n \to G$ with $f(1) = x$. (b) Use (a) to prove that $\mathrm{Aut}(\mathbb{Z}_n)$
is isomorphic to the group $\mathbb{Z}_n^*$ of invertible elements of $\mathbb{Z}_n$ under multiplication modulo $n$.

7-54. Properties of Order. Let $G$ be a group and $x \in G$. (a) Prove $x$ and $x^{-1}$ have the
same order. (b) Show that if $x$ has infinite order, then so does $x^i$ for all nonzero integers $i$.
(c) Suppose $x$ has finite order $n$. Show that the order of $x^k$ is $n/\gcd(k, n)$ for all $k \in \mathbb{Z}$.
(d) Show that if $f : G \to H$ is a group isomorphism, then $x$ and $f(x)$ have the same order.
(e) What can be said in part (d) if $f$ is only a group homomorphism?

7-55. Quotient Groups. (a) Suppose $H$ is a normal subgroup of $G$. Show that the set
$G/H$ of left cosets of $H$ in $G$ becomes a group of size $[G : H]$ if we define $(xH) \star (yH) =
(xy)H$ for all $x, y \in G$. (One must first show that this operation is well-defined: i.e., for all
$x_1, x_2, y_1, y_2 \in G$, $x_1H = x_2H$ and $y_1H = y_2H$ imply $x_1y_1H = x_2y_2H$. For this, use the
Coset Equality Theorem.) (b) With the notation in (a), define $\pi : G \to G/H$ by $\pi(x) = xH$
for $x \in G$. Show that $\pi$ is a surjective group homomorphism with kernel $H$.
(c) Let $H = \{\mathrm{id}, (1,2)\} \le S_3$. Find $x_1, x_2, y_1, y_2 \in S_3$ with $x_1H = x_2H$ and $y_1H = y_2H$,
but $x_1y_1H \ne x_2y_2H$. This shows that normality of $H$ is needed for the product in (a) to
be well-defined.


7-56. Let H be a normal subgroup of a group G. (a) Prove that G/H is commutative if G 
is commutative. (b) Prove that G/H is cyclic if G is cyclic. (c) Does the converse of (a) or 
(b) hold? Explain. 

7-57. The Fundamental Homomorphism Theorem for Groups. Suppose $G$ and $H$
are groups and $f : G \to H$ is a group homomorphism. Let $K = \{x \in G : f(x) = e_H\}$ be the
kernel of $f$, and let $I = \{y \in H : \exists x \in G,\ y = f(x)\}$ be the image of $f$. Show that $K \trianglelefteq G$,
$I \le H$, and there exists a unique group isomorphism $\bar{f} : G/K \to I$ given by $\bar{f}(xK) = f(x)$
for $x \in G$.

7-58. The Universal Mapping Property for Quotient Groups. Let $G$ be a group
with normal subgroup $N$, let $\pi : G \to G/N$ be the homomorphism $\pi(x) = xN$ for $x \in G$,
and let $H$ be any group. (a) Show that if $h : G/N \to H$ is a group homomorphism, then
$h \circ \pi$ is a group homomorphism from $G$ to $H$ sending each $n \in N$ to $e_H$. (b) Conversely,
given any group homomorphism $f : G \to H$ such that $f(n) = e_H$ for all $n \in N$, show that
there exists a unique group homomorphism $h : G/N \to H$ such that $f = h \circ \pi$. (c) Conclude
that the map sending $h$ to $h \circ \pi$ is a bijection from the set of all group homomorphisms
from $G/N$ to $H$ to the set of all group homomorphisms from $G$ to $H$ that map everything
in $N$ to $e_H$.


7-59. The Diamond Isomorphism Theorem for Groups. Suppose $G$ is a group, $S \trianglelefteq G$,
and $T \le G$. Show that $TS = ST \le G$, $S \trianglelefteq TS$, $(S \cap T) \trianglelefteq T$, and there is a well-defined
group isomorphism $f : T/(S \cap T) \to (TS)/S$ given by $f(x(S \cap T)) = xS$ for all $x \in T$. Use
this to give another solution to Exercise 7-36 in the case where $S$ is normal in $G$.

7-60. The Double-Quotient Isomorphism Theorem for Groups. Assume $A \le
B \le C$ are groups with $A$ and $B$ both normal in $C$. Show that $A \trianglelefteq B$, $B/A \trianglelefteq C/A$,
and $(C/A)/(B/A)$ is isomorphic to $C/B$ via the map sending $(xA)B/A$ to $xB$ for $x \in C$.

7-61. The Correspondence Theorem for Quotient Groups. Let $H$ be a normal
subgroup of a group $G$. Let $X$ be the set of subgroups of $G$ containing $H$, and let $Y$ be
the set of subgroups of $G/H$. Show that the map sending $L \in X$ to $L/H = \{xH : x \in L\}$
is an inclusion-preserving bijection from $X$ onto $Y$ with inverse map sending $M \in Y$ to
$\{x \in G : xH \in M\}$. If $L$ maps to $M$ under this correspondence, show $[G : L] = [G/H : M]$,
$[L : H] = |M|$, $L \trianglelefteq G$ iff $M \trianglelefteq G/H$, and $G/L$ is isomorphic to $(G/H)/M$ whenever $L \trianglelefteq G$.

7-62. Let $G$ be a non-commutative group. Show that the rule $g \cdot x = xg$ (for $g, x \in G$) does
not define a (left) action of $G$ on the set $G$.

7-63. Let $G$ act on itself by conjugation: $g * x = gxg^{-1}$ for $g, x \in G$. Verify that the axioms
for a group action are satisfied.

7-64. Let $X$ be a $G$-set with action $*$, and let $f : K \to G$ be a group homomorphism. Verify
the $K$-set axioms for the action given by $k \bullet x = f(k) * x$ for all $k \in K$ and $x \in X$.

7-65. Suppose $* : G \times X \to X$ is a group action. (a) Show that $\mathcal{P}(X)$ is a $G$-set via the
action $g \bullet S = \{g * s : s \in S\}$ for $g \in G$ and $S \in \mathcal{P}(X)$. (b) For fixed $k$, show that the set of
all $k$-element subsets of $X$ is a $G$-stable subset of $\mathcal{P}(X)$.

7-66. Verify the action axioms for the action of $S_n$ on $V$ in Example 7.68.

7-67. Suppose $X$ is a $G$-set with action $*$, and $W$ is a set. Show that the set of functions
$F : W \to X$ is a $G$-set via the action $(g \bullet F)(w) = g * (F(w))$ for all $g \in G$, $F \in \mathrm{Fun}(W, X)$,
and $w \in W$.

7-68. Let a subgroup $H$ of a group $G$ act on $G$ via $h * x = xh^{-1}$ for $h \in H$ and $x \in G$.
Show that the orbit $H * x$ is the left coset $xH$, for all $x \in G$.

7-69. (a) Suppose $f : X \to Y$ is a bijection. Show that the map $T : \mathrm{Sym}(X) \to \mathrm{Sym}(Y)$
given by $T(g) = f \circ g \circ f^{-1}$ for $g \in \mathrm{Sym}(X)$ is a group isomorphism. (b) Use (a) and Cayley’s
Theorem to conclude that every $n$-element group is isomorphic to a subgroup of $S_n$.

7-70. Let a group $G$ act on a set $X$. Show that $\bigcap_{x \in X} \mathrm{Stab}(x)$ is a normal subgroup of $G$.
Give an example to show that a stabilizer subgroup $\mathrm{Stab}(x)$ may not be normal in $G$.

7-71. Let G act on itself by conjugation. (a) By considering the associated permutation 
representation and using the Fundamental Homomorphism Theorem, deduce that G/Z(G) is 
isomorphic to the subgroup of inner automorphisms in Aut(G). (b) Show that the subgroup 
of inner automorphisms is normal in Aut(G). 


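7-72. Let the additive group $\mathbb{R}$ act on the set of column vectors $\mathbb{R}^2$ by the rule

\[ \theta * \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \quad \text{for all } \theta, x, y \in \mathbb{R}. \]

Verify that this is a group action, and describe the orbit and stabilizer of each point in $\mathbb{R}^2$.
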
7-73. Let $f \in S_n$, and let $\langle f \rangle$ act on $\{1, 2, \ldots, n\}$ via $g \cdot x = g(x)$ for all $g \in \langle f \rangle$ and
$x \in \{1, 2, \ldots, n\}$. Prove that the orbits of this action are the connected components of the
digraph of $f$.

7-74. Suppose $X$ is a $G$-set and $x, y \in X$. Without appealing to equivalence relations, give
a direct proof that $Gx \cap Gy \ne \emptyset$ implies $Gx = Gy$.

7-75. Let $*$ be a right action of a group $G$ on a set $X$. (a) Prove that $X$ is the disjoint union
of orbits $x * G$. (b) Prove that $|x * G| = [G : \mathrm{Stab}(x)]$, where $\mathrm{Stab}(x) = \{g \in G : x * g = x\}$.

7-76. State and prove a version of the Coset Equality Theorem 7.92 for right cosets.

7-77. Let $G$ be a group with subgroup $H$. Prove that the map $T(xH) = Hx^{-1}$ for $x \in G$
is a well-defined bijection from the set of left cosets of $H$ in $G$ onto the set of right cosets
of $H$ in $G$.

7-78. Let $X$ be a $G$-set. For $x \in X$ and $g \in G$, prove that $g\,\mathrm{Stab}(x) = \{h \in G : h * x = g * x\}$.
(This shows that each left coset of the stabilizer of $x$ consists of those group elements sending
$x$ to a particular element in its orbit $Gx$. Compare to Theorem 7.101.)

7-79. Let $G$ be a group with subgroup $H$. Prove the following facts about the normalizer
of $H$ in $G$ (see Example 7.107). (a) $N_G(H)$ contains $H$; (b) $H \trianglelefteq N_G(H)$; (c) for any $L \le G$
such that $H \trianglelefteq L$, $L \le N_G(H)$; (d) $H \trianglelefteq G$ iff $N_G(H) = G$.

7-80. Let $X$ be a $G$-set. Prove: for all $g \in G$ and $x \in X$, $\mathrm{Stab}(g * x) = g\,\mathrm{Stab}(x)\,g^{-1}$.

7-81. Let $H$ and $K$ be subgroups of a group $G$. Prove that the $G$-sets $G/H$ and $G/K$ are
isomorphic (as defined in Remark 7.110) iff $H$ and $K$ are conjugate subgroups of $G$ (i.e.,
$K = gHg^{-1}$ for some $g \in G$).

7-82. Calculate $z_\mu$ for all integer partitions $\mu$ with $|\mu| = 6$.

7-83. List all elements in the centralizer of $g = (2,4,7)(1,6)(3,8)(5) \in S_8$. How large is this
centralizer? How large is the conjugacy class of $g$?

7-84. Suppose $f = (2,4,7)(8,10,15)(1,9)(11,12)(17,20)(18,19)$ and
$g = (7,8,9)(1,4,5)(11,20)(2,6)(3,18)(13,19)$. How many $h \in S_{20}$ satisfy $h \circ f = g \circ h$?

7-85. Find all integer partitions $\mu$ of $n$ for which $z_\mu = n!$. Use your answer to calculate
$Z(S_n)$ for all $n \ge 1$.


7-86. Prove that for all $n \ge 1$, $p(n) = \frac{1}{n!} \sum_{f \in S_n} z_{\mathrm{type}(f)}$.


7-87. Conjugacy Classes of $A_n$. For $f \in A_n$, write $[f]_{A_n}$ to denote the conjugacy class
of $f$ in $A_n$, and write $[f]_{S_n}$ to denote the conjugacy class of $f$ in $S_n$. (a) Prove: for all
$f \in A_n$, $[f]_{A_n} \subseteq [f]_{S_n}$. (b) Prove: for all $f \in A_n$, if there exists $g \in S_n - A_n$ with $fg = gf$,
then $[f]_{A_n} = [f]_{S_n}$; but if no such $g$ exists, then $[f]_{S_n}$ is the disjoint union of $[f]_{A_n}$ and
$[(1,2) \circ f \circ (1,2)]_{A_n}$, and the latter two conjugacy classes are equal in size. (c) What are the
conjugacy classes of $A_5$? How large are they?

7-88. Prove that the only normal subgroups of $A_5$ are $\{\mathrm{id}\}$ and $A_5$ (use part (c) of the
previous exercise).


7-89. Suppose G is a finite group and p is a prime divisor of |G|. Show that the number of 
elements in G of order p is congruent to —1 (mod p). 


7-90. (a) Compute (2238) mod 7. (b) Compute (513) mod 10. 


7-91. Prove Corollary 7.119 without using Lucas’s Congruence, by counting powers of $p$ in
the numerator and denominator of $\binom{p^a b}{p^a} = (p^a b)(p^a b - 1) \cdots (p^a b - p^a + 1)/(p^a)!$.

7-92. The Class Equation. Let $G$ be a finite group with center $Z(G)$ (see Example 7.106),
and let $x_1, \ldots, x_k \in G$ be such that each conjugacy class of $G$ of size greater than 1 contains
exactly one $x_i$. Prove that $|G| = |Z(G)| + \sum_{i=1}^{k} [G : C_G(x_i)]$, where each term in the sum is
a divisor of $|G|$ greater than 1.

7-93. A $p$-group is a finite group of size $p^e$ for some $e \ge 1$. Prove that every $p$-group $G$ has
$|Z(G)| > 1$.

7-94. Wilson’s Theorem. Use group actions to prove that if an integer $p > 1$ is prime,
then $(p - 1)! \equiv -1 \pmod{p}$. Is the converse true?

7-95. How many ways are there to color an $n \times n$ chessboard with $q$ possible colors if:
(a) no symmetries are allowed; (b) rotations of a given board are considered equivalent;
(c) rotations and reflections of a given board are considered equivalent?

7-96. Consider an $m \times n$ chessboard where $m \ne n$. (a) Describe all symmetries of this board
(rotations and reflections). (b) How many ways can we color such a board with $q$ possible
colors taking symmetries into account?

7-97. How many n-letter words can be made using a k-letter alphabet if we identify each 
word with its reversal? 


7-98. Consider necklaces that can use $q$ kinds of gemstones, where rotations and reflections
of a given necklace are considered equivalent. How many such necklaces are there with:
(a) eight gems; (b) nine gems; (c) $n$ gems?

7-99. Taking rotational symmetries into account, how many ways can we color the vertices 
of a regular tetrahedron with 7 available colors? 

7-100. Taking rotational symmetries into account, how many ways can we color the vertices 
of a cube with 8 available colors? 

7-101. Taking rotational symmetries into account, how many ways can we color the faces
of a cube with $q$ available colors?

7-102. Taking rotational symmetries into account, how many ways can we color the edges
of a cube with $q$ available colors?

7-103. Taking all symmetries into account, how many ways are there to color the vertices
of the cycle $C_3$ with three distinct colors chosen from a set of five colors?

7-104. Taking all symmetries into account, how many ways are there to color the vertices
of the cycle $C_6$ so that three vertices are blue, two are red, and one is yellow?


7-105. Taking rotational symmetries into account, how many ways are there to color the 
vertices of a regular tetrahedron so that: (a) two are blue and two are red; (b) one is red, 
one is blue, one is green, and one is yellow? 


7-106. Taking rotational symmetries into account, how many ways are there to color the 
vertices of a cube so that four are blue, two are red, and two are green? 


7-107. Taking rotational symmetries into account, how many ways are there to color the 
faces of a cube so that: (a) three are red, two are blue, and one is green; (b) two are red, 
two are blue, one is green, and one is yellow? 


7-108. Taking rotational symmetries into account, how many ways are there to color the 
edges of a cube so that four are red, four are blue, and four are yellow? 


7-109. How many ways can we color a $4 \times 4$ chessboard with five colors (identifying rotations
of a given board) if each color must be used at least once?

7-110. How many ways can we build an eight-gem necklace using five kinds of gems
(identifying rotations and reflections of a given necklace) if each type of gem must be used at
least once?
Notes


For a more detailed development of group theory, we recommend the excellent book by 
Rotman [114]. More information on groups, rings, and fields may be found in textbooks 
on abstract algebra such as [26, 65, 66]. The proof of Cauchy’s Theorem in 7.117 is due to 
James McKay [86]. The proof of Lucas’s Congruence in 7.118 is due to Sagan [116]. The 
proof of Sylow’s Theorem in 7.120 is often attributed to Wielandt [131], although G. A. 
Miller [87] gave a proof along similar lines over 40 years earlier. Proofs of Fermat’s Little 
Theorem and Wilson’s Theorem using group actions were given by Peterson [98]. 
Permutation Statistics and q-Analogues


In Chapter 1, we used the Sum Rule and the Product Rule to count permutations, lattice
paths, anagrams, and many other collections of combinatorial objects. We found that the
factorial $n!$ counts permutations of an $n$-element set; the binomial coefficient $\binom{n}{k}$ counts the
number of lattice paths from $(0,0)$ to $(k, n-k)$; and the multinomial coefficient $\binom{n_1 + \cdots + n_k}{n_1, \ldots, n_k}$
counts anagrams that contain $n_i$ copies of letter $a_i$ for $1 \le i \le k$.

In this chapter, we generalize these results by introducing various weight functions (called
statistics) on permutations, lattice paths, anagrams, and other combinatorial objects. For
example, we could weight all permutations $w \in S_n$ by the inversion statistic $\mathrm{inv}(w)$, which
was defined in §7.4. Letting $q$ denote a formal variable, we then study the polynomial
$\sum_{w \in S_n} q^{\mathrm{inv}(w)}$, which is a sum of $n!$ monomials in the variable $q$. This polynomial is called
a $q$-analogue of $n!$, since setting $q = 1$ in the polynomial produces $n!$. This construction is
a special case of the generating function of a weighted set, which we studied in Chapter 5.
However, all weighted sets considered here will be finite, so that the associated generating
functions are polynomials rather than formal power series.
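
To make the polynomial $\sum_{w \in S_n} q^{\mathrm{inv}(w)}$ concrete, the following Python sketch (an illustration, not part of the text; the function names are hypothetical) tabulates its coefficients by brute force. A polynomial is stored as a list whose entry $k$ is the coefficient of $q^k$.

```python
from itertools import permutations
from math import factorial


def inv(w) -> int:
    """Number of pairs i < j with w[i] > w[j]."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def inversion_polynomial(n: int) -> list:
    """Coefficient list c, where c[k] is the number of w in S_n with inv(w) = k."""
    coeffs = [0] * (n * (n - 1) // 2 + 1)
    for w in permutations(range(1, n + 1)):
        coeffs[inv(w)] += 1
    return coeffs


if __name__ == "__main__":
    c = inversion_polynomial(3)
    print(c)                       # coefficients of q^0, q^1, q^2, q^3
    print(sum(c) == factorial(3))  # setting q = 1 recovers 3!
```

Summing the coefficient list is the same as substituting $q = 1$, which is why the total is $n!$.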

This chapter studies several permutation statistics (including inversions, descents, and
major index) and their generalizations to statistics on words and lattice paths. We develop
algebraic and combinatorial formulas for $q$-factorials, $q$-binomial coefficients, $q$-multinomial
coefficients, $q$-Catalan numbers, and $q$-Stirling numbers. We also encounter some remarkable
bijections proving that two different statistics on a given set of objects have the same
generating function.




8.1 Statistics on Finite Sets 


Generating functions for arbitrary weighted sets have already been introduced in Chapter 5. 
To keep this chapter self-contained, we begin with a short summary of the relevant concepts 
in the special case of finite weighted sets. 


8.1. Definition: Generating Function of a Finite Weighted Set. Given a finite set
$S$, a statistic on $S$ is any function $\mathrm{wt} : S \to \mathbb{Z}_{\ge 0}$. We call the pair $(S, \mathrm{wt})$ a weighted set.
The generating function for such a set is the polynomial

\[ G_{S,\mathrm{wt}}(q) = \sum_{z \in S} q^{\mathrm{wt}(z)}, \]

where $q$ is a formal variable.

8.2. Example. Suppose $S = \{a, b, c, d, e, f\}$, and $\mathrm{wt} : S \to \mathbb{Z}_{\ge 0}$ is given by
$\mathrm{wt}(a) = 4$, $\mathrm{wt}(b) = 1$, $\mathrm{wt}(c) = 0$, $\mathrm{wt}(d) = 4$, $\mathrm{wt}(e) = 4$, $\mathrm{wt}(f) = 1$.

The generating function for $(S, \mathrm{wt})$ is

\[ G_{S,\mathrm{wt}}(q) = q^{\mathrm{wt}(a)} + q^{\mathrm{wt}(b)} + \cdots + q^{\mathrm{wt}(f)} = q^4 + q^1 + q^0 + q^4 + q^4 + q^1 = 1 + 2q + 3q^4. \]

Define another statistic $w : S \to \mathbb{Z}_{\ge 0}$ by setting $w(a) = 0$, $w(b) = 1$, $w(c) = 2$, $w(d) = 3$,
$w(e) = 4$, and $w(f) = 5$. This new statistic leads to a different generating function, namely
$G_{S,w}(q) = 1 + q + q^2 + q^3 + q^4 + q^5$.
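
The definition of the generating function translates directly into a short computation. The sketch below is illustrative only (the helper name `generating_function` is an assumption, not notation from the text); it collects equal powers of $q$ and returns the coefficient list, then applies it to the two statistics on $S = \{a, b, c, d, e, f\}$ described in Example 8.2.

```python
from collections import Counter


def generating_function(objects, wt) -> list:
    """Return [a_0, a_1, ..., a_m], where a_k counts the objects of weight k."""
    counts = Counter(wt(z) for z in objects)
    top = max(counts, default=0)
    return [counts[k] for k in range(top + 1)]


if __name__ == "__main__":
    S = "abcdef"
    wt = {"a": 4, "b": 1, "c": 0, "d": 4, "e": 4, "f": 1}
    w = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4, "f": 5}
    print(generating_function(S, wt.get))  # [1, 2, 0, 0, 3], i.e. 1 + 2q + 3q^4
    print(generating_function(S, w.get))   # [1, 1, 1, 1, 1, 1]
```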
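
8.3. Example. Suppose $S$ is the set of all subsets of $\{1, 2, 3\}$, so
$S = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{1,3\}, \{2,3\}, \{1,2,3\}\}$.
Define three statistics on subsets $A \in S$ as follows:

\[ w_1(A) = |A|; \qquad w_2(A) = \sum_{i \in A} i; \qquad w_3(A) = \min\{i : i \in A\} \text{ for } A \ne \emptyset; \quad w_3(\emptyset) = 0. \]

Each statistic produces its own generating function:

\[ G_{S,w_1}(q) = q^0 + q^1 + q^1 + q^1 + q^2 + q^2 + q^2 + q^3 = 1 + 3q + 3q^2 + q^3 = (1 + q)^3; \]
\[ G_{S,w_2}(q) = q^0 + q^1 + q^2 + q^3 + q^3 + q^4 + q^5 + q^6 = 1 + q + q^2 + 2q^3 + q^4 + q^5 + q^6; \]
\[ G_{S,w_3}(q) = q^0 + q^1 + q^2 + q^3 + q^1 + q^1 + q^2 + q^1 = 1 + 4q + 2q^2 + q^3. \]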
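
The three generating functions in Example 8.3 can be reproduced by enumerating the eight subsets. This is a minimal sketch with hypothetical helper names; subsets are represented as increasing tuples, and each polynomial is returned as a list of coefficients indexed by exponent.

```python
from itertools import combinations


def subsets(n: int):
    """Yield every subset of {1, ..., n} as a tuple in increasing order."""
    for size in range(n + 1):
        yield from combinations(range(1, n + 1), size)


def coefficients(weights) -> list:
    """Coefficient list of the sum of q**k over all weights k."""
    weights = list(weights)
    coeffs = [0] * (max(weights) + 1)
    for k in weights:
        coeffs[k] += 1
    return coeffs


if __name__ == "__main__":
    S = list(subsets(3))
    print(coefficients(len(A) for A in S))             # w1: size of A
    print(coefficients(sum(A) for A in S))             # w2: sum of the elements of A
    print(coefficients(min(A, default=0) for A in S))  # w3: least element, 0 for the empty set
```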


8.4. Example. For each integer $n \ge 0$, we have introduced the notation $[n]$ for the set
$\{0, 1, 2, \ldots, n-1\}$. Define a weight function on this set by letting $\mathrm{wt}(i) = i$ for all $i \in [n]$.
The associated generating function is

\[ G_{[n],\mathrm{wt}}(q) = q^0 + q^1 + q^2 + \cdots + q^{n-1} = \frac{q^n - 1}{q - 1}. \]

The last equality can be verified by using the distributive law to calculate

\[ (q - 1)(1 + q + q^2 + \cdots + q^{n-1}) = q^n - 1. \]


The generating function in this example will be a recurring building block in our later work, 
so we give it a special name. 


8.5. Definition: q-Integers. If $n$ is a positive integer and $q$ is any variable, define the
$q$-integer

\[ [n]_q = 1 + q + q^2 + \cdots + q^{n-1} = \frac{q^n - 1}{q - 1}, \]

which is a polynomial in $q$. Also set $[0]_q = 0$.
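
The identity $(q - 1)[n]_q = q^n - 1$ behind this definition can be confirmed with explicit polynomial arithmetic. The following sketch is an illustration under the assumption that polynomials are stored as coefficient lists indexed by exponent; `q_integer` and `poly_mul` are hypothetical helper names.

```python
def q_integer(n: int) -> list:
    """Coefficient list of [n]_q = 1 + q + ... + q**(n-1); [0]_q is the zero polynomial."""
    return [1] * n if n > 0 else [0]


def poly_mul(a: list, b: list) -> list:
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


if __name__ == "__main__":
    n = 5
    q_minus_1 = [-1, 1]                       # the polynomial q - 1
    print(poly_mul(q_minus_1, q_integer(n)))  # [-1, 0, 0, 0, 0, 1], i.e. q^5 - 1
```

The middle coefficients cancel in pairs, which is the telescoping step in the distributive-law calculation.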


8.6. Example. Let $S$ be the set of all lattice paths from $(0,0)$ to $(2,3)$. For $P \in S$, let
$w(P)$ be the number of unit squares in the region bounded by $P$, the $x$-axis, and the line
$x = 2$. Let $w'(P)$ be the number of unit squares in the region bounded by $P$, the $y$-axis,
and the line $y = 3$. By examining the paths in Figure 1.2, we compute

\[ G_{S,w}(q) = q^0 + q^1 + q^2 + q^2 + q^3 + q^3 + q^4 + q^4 + q^5 + q^6 = 1 + q + 2q^2 + 2q^3 + 2q^4 + q^5 + q^6; \]
\[ G_{S,w'}(q) = q^6 + q^5 + q^4 + q^4 + q^3 + q^3 + q^2 + q^2 + q^1 + q^0 = 1 + q + 2q^2 + 2q^3 + 2q^4 + q^5 + q^6. \]

Although the two weight functions are not equal (since there are paths $P$ with $w(P) \ne
w'(P)$), it happens that $G_{S,w}(q) = G_{S,w'}(q)$ in this example.

Next, consider the set $T$ of Dyck paths from $(0,0)$ to $(3,3)$. For $P \in T$, let $\mathrm{wt}(P)$ be
the number of complete unit squares located between $P$ and the diagonal line $y = x$. Using
Figure 1.3, we find that

\[ G_{T,\mathrm{wt}}(q) = q^0 + q^1 + q^1 + q^2 + q^3 = 1 + 2q + q^2 + q^3. \]
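
The area statistics in Example 8.6 can be recomputed without the figures. The sketch below is illustrative and rests on two stated assumptions: a lattice path is encoded as a string of unit steps E (east) and N (north), and a Dyck path is a path from $(0,0)$ to $(n,n)$ that never goes strictly below the line $y = x$. All function names are hypothetical.

```python
from itertools import combinations


def lattice_paths(a: int, b: int):
    """Yield paths from (0,0) to (a,b) as strings of a E-steps and b N-steps."""
    for east in combinations(range(a + b), a):
        yield "".join("E" if i in east else "N" for i in range(a + b))


def area_below(path: str) -> int:
    """Unit squares between the path and the x-axis: an E-step at height y adds y."""
    y = area = 0
    for step in path:
        if step == "N":
            y += 1
        else:
            area += y
    return area


def is_dyck(path: str) -> bool:
    """True if the path never goes strictly below the line y = x."""
    x = y = 0
    for step in path:
        x, y = (x + 1, y) if step == "E" else (x, y + 1)
        if x > y:
            return False
    return True


def dyck_area(path: str) -> int:
    """Complete unit squares between the path and y = x: an N-step from (x, y) adds y - x."""
    x = y = area = 0
    for step in path:
        if step == "N":
            area += y - x
            y += 1
        else:
            x += 1
    return area


def coefficients(weights) -> list:
    """Coefficient list of the sum of q**k over all weights k."""
    weights = list(weights)
    coeffs = [0] * (max(weights) + 1)
    for k in weights:
        coeffs[k] += 1
    return coeffs


if __name__ == "__main__":
    S = list(lattice_paths(2, 3))
    print(coefficients(area_below(P) for P in S))          # statistic w
    print(coefficients(2 * 3 - area_below(P) for P in S))  # statistic w' = 6 - w
    T = [P for P in lattice_paths(3, 3) if is_dyck(P)]
    print(coefficients(dyck_area(P) for P in T))
```

Since the two regions counted by $w$ and $w'$ together fill the $2 \times 3$ rectangle, $w'(P) = 6 - w(P)$ for every path, and the symmetry of the coefficient list explains why the two generating functions agree.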

For any finite weighted set $(S, \mathrm{wt})$, we know $G_{S,\mathrm{wt}}(q) = \sum_{z \in S} q^{\mathrm{wt}(z)}$. By collecting
together equal powers of $q$ (as done in the calculations above), we can write this polynomial
in the standard form $G_{S,\mathrm{wt}}(q) = a_0 q^0 + a_1 q^1 + a_2 q^2 + \cdots + a_m q^m$, where each $a_i \in \mathbb{Z}_{\ge 0}$.
Comparing the two formulas for the generating function, we see that the coefficient $a_i$ of $q^i$
in $G_{S,\mathrm{wt}}(q)$ is the number of objects $z$ in $S$ such that $\mathrm{wt}(z) = i$. The next examples illustrate
this observation.


8.7. Example. Suppose $T$ is the set of all set partitions of an $n$-element set, and the weight
of a partition is the number of blocks in the partition. By definition of the Stirling number
of the second kind (see §2.12), we have $G_T(q) = \sum_{k=0}^{n} S(n, k) q^k$. Similarly, if $U$ is the set
of all permutations of $n$ elements, weighted by the number of cycles in the disjoint cycle
decomposition, then $G_U(q) = \sum_{k=0}^{n} s'(n, k) q^k$, where $s'(n, k)$ is a signless Stirling number
of the first kind (see §3.6). Finally, if $V$ is the set of all integer partitions of $n$, weighted by
number of parts, then $G_V(q) = \sum_{k=0}^{n} p(n, k) q^k$. This is also the generating function for $V$
if we weight a partition by the length of its largest part (see §2.11).


8.8. Remark. Suppose we replace the variable q in $G_S(q)$ by the value 1. We obtain
$G_S(1) = \sum_{z \in S} 1^{\mathrm{wt}(z)} = \sum_{z \in S} 1 = |S|$. For instance, in Example 8.6, $G_{T,\mathrm{wt}}(1) = 5 = C_3$. In
Example 8.7, $G_T(1) = B(n)$ (the Bell number), $G_U(1) = n!$, and $G_V(1) = p(n)$. Thus, the
generating function $G_S(q)$ can be viewed as a weighted analogue of the number of elements
in S. This is the origin of the term "q-analogue." On the other hand, using the convention
that $0^0 = 1$, $G_S(0)$ is the number of objects in S having weight zero.

We also note that the polynomial $G_S(q)$ can sometimes be factored or otherwise simplified,
as illustrated by the first weight function in Example 8.3. Different statistics on S
often lead to different generating functions, but this is not always true (see Examples 8.3
and 8.6).




8.2 Counting Rules for Finite Weighted Sets 


This section reviews the counting rules for weighted sets, which were given earlier in §5.8. 
We also give proofs of the Sum Rule and Product Rule for finite weighted sets that are a 
bit simpler than the proofs given earlier for the general case. 


8.9. The Sum Rule for Finite Weighted Sets. Suppose S is a finite weighted set that
is the disjoint union of k weighted sets $S_1, S_2, \ldots, S_k$. Assume $\mathrm{wt}_{S_i}(u) = \mathrm{wt}_S(u)$ whenever
$1 \leq i \leq k$ and $u \in S_i$. Then

\[ G_S(q) = G_{S_1}(q) + G_{S_2}(q) + \cdots + G_{S_k}(q). \]


8.10. The Product Rule for Finite Weighted Sets. Suppose k is a fixed positive
integer and $S_1, \ldots, S_k$ are finite weighted sets. Suppose S is a weighted set such that every
$u \in S$ can be constructed in exactly one way by choosing $u_1 \in S_1$, choosing $u_2 \in S_2$,
and so on, and then assembling the chosen objects $u_1, \ldots, u_k$ in a prescribed manner.
Assume that whenever u is constructed from $u_1, u_2, \ldots, u_k$, the weight-additivity condition
$\mathrm{wt}_S(u) = \mathrm{wt}_{S_1}(u_1) + \mathrm{wt}_{S_2}(u_2) + \cdots + \mathrm{wt}_{S_k}(u_k)$ holds. Then

\[ G_S(q) = G_{S_1}(q) \cdot G_{S_2}(q) \cdot \ldots \cdot G_{S_k}(q). \]


8.11. The Bijection Rule for Finite Weighted Sets. Suppose S and T are finite
weighted sets and $f : S \to T$ is a weight-preserving bijection, meaning that
$\mathrm{wt}_T(f(u)) = \mathrm{wt}_S(u)$ for all $u \in S$. Then $G_S(q) = G_T(q)$.

We prove the Sum Rule for Finite Weighted Sets as follows. Assume the setup in that
rule. By definition, $G_{S,\mathrm{wt}}(q) = \sum_{z \in S} q^{\mathrm{wt}(z)}$. Because addition of polynomials is commutative
and associative, we can order the terms of this sum so that the objects in $S_1$ come first,
followed by the objects in $S_2$, and so on, ending with the objects in $S_k$. No term is duplicated
in the resulting list, since the sets $S_i$ are pairwise disjoint. We now compute

\begin{align*}
G_{S,\mathrm{wt}}(q) &= \sum_{z \in S_1} q^{\mathrm{wt}(z)} + \sum_{z \in S_2} q^{\mathrm{wt}(z)} + \cdots + \sum_{z \in S_k} q^{\mathrm{wt}(z)} \\
&= \sum_{z \in S_1} q^{\mathrm{wt}_1(z)} + \sum_{z \in S_2} q^{\mathrm{wt}_2(z)} + \cdots + \sum_{z \in S_k} q^{\mathrm{wt}_k(z)} \\
&= G_{S_1,\mathrm{wt}_1}(q) + G_{S_2,\mathrm{wt}_2}(q) + \cdots + G_{S_k,\mathrm{wt}_k}(q).
\end{align*}


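Turning to the proof of the Product Rule for Finite Weighted Sets, first consider the
special case where $S = A \times B$ and $\mathrm{wt}_S(a,b) = \mathrm{wt}_A(a) + \mathrm{wt}_B(b)$ for all $a \in A$ and $b \in B$.
Using the distributive law, associative law, and commutative law to simplify finite sums,
we compute:

\begin{align*}
G_S(q) &= \sum_{z \in S} q^{\mathrm{wt}_S(z)} = \sum_{(a,b) \in A \times B} q^{\mathrm{wt}_S(a,b)} = \sum_{a \in A} \sum_{b \in B} q^{\mathrm{wt}_A(a) + \mathrm{wt}_B(b)} = \sum_{a \in A} \sum_{b \in B} q^{\mathrm{wt}_A(a)} q^{\mathrm{wt}_B(b)} \\
&= \sum_{a \in A} \Bigl( q^{\mathrm{wt}_A(a)} \sum_{b \in B} q^{\mathrm{wt}_B(b)} \Bigr) = \Bigl( \sum_{a \in A} q^{\mathrm{wt}_A(a)} \Bigr) \cdot \Bigl( \sum_{b \in B} q^{\mathrm{wt}_B(b)} \Bigr) = G_A(q) \cdot G_B(q).
\end{align*}

It follows by induction on k that the set $S' = S_1 \times S_2 \times \cdots \times S_k$ with weight function
$\mathrm{wt}_{S'}(s_1, \ldots, s_k) = \sum_{i=1}^{k} \mathrm{wt}_{S_i}(s_i)$ has generating function $G_{S'}(q) = \prod_{i=1}^{k} G_{S_i}(q)$. Finally,
the hypothesis of the Product Rule 8.10 says that there is a weight-preserving bijection
from S' to S. Thus the Product Rule is a consequence of the preceding remarks and the
Bijection Rule.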

Later, we need the following generalization of the Bijection Rule for Weighted Sets; we 
leave the proof as an exercise. 


8.12. The Weight-Shifting Rule. Suppose $(S, w_S)$ and $(T, w_T)$ are finite weighted sets,
$f : S \to T$ is a bijection, and b is a fixed integer such that for all $z \in S$, $w_S(z) = w_T(f(z)) + b$.
Then $G_{S,w_S}(q) = q^b \, G_{T,w_T}(q)$.




8.3 Inversions


The inversion statistic for permutations was introduced in §7.4. We now define inversions 
for general words. 


8.13. Definition: Inversions. Suppose $w = w_1 w_2 \cdots w_n$ is a word, where each letter $w_i$ is
an integer. An inversion of w is a pair of indices i < j such that $w_i > w_j$. We write inv(w)
for the number of inversions of w. Also let Inv(w) be the set of all inversion pairs (i, j).


Thus, inv(w) = |Inv(w)| counts pairs of letters in w (not necessarily adjacent) that are
out of numerical order. If S is any finite set of words using the alphabet $\mathbb{Z}$, then

\[ G_{S,\mathrm{inv}}(q) = \sum_{w \in S} q^{\mathrm{inv}(w)} \]

is the inversion generating function for S. These definitions extend to words using any
totally ordered alphabet.

8.14. Example. Consider the word w = 414253; here $w_1 = 4$, $w_2 = 1$, $w_3 = 4$, etc. The
pair (1,2) is an inversion of w since $w_1 = 4 > 1 = w_2$. The pair (2,3) is not an inversion,
since $w_2 = 1 < 4 = w_3$. Similarly, (1,3) is not an inversion. Continuing in this way, we find
that Inv(w) = {(1,2), (1,4), (1,6), (3,4), (3,6), (5,6)}, so inv(w) = 6.
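The definition of Inv(w) translates directly into a double loop over index pairs. The sketch below is a straightforward (quadratic-time) illustration rather than an optimized routine; the function names are hypothetical, and the word is passed as a string of single digits so that character comparison agrees with numerical order.

```python
def inversion_pairs(word):
    """Return Inv(w): all 1-based index pairs (i, j) with i < j and w_i > w_j."""
    n = len(word)
    return {(i + 1, j + 1)
            for i in range(n)
            for j in range(i + 1, n)
            if word[i] > word[j]}

def inv(word):
    """Number of inversions of the word."""
    return len(inversion_pairs(word))

pairs = inversion_pairs("414253")
print(sorted(pairs))  # [(1, 2), (1, 4), (1, 6), (3, 4), (3, 6), (5, 6)]
print(inv("414253"))  # 6
```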


8.15. Example. Let S be the set of all permutations of {1,2,3}. We know that

S = {123, 132, 213, 231, 312, 321}.

Counting inversions, we conclude that

\[ G_{S,\mathrm{inv}}(q) = q^0 + q^1 + q^1 + q^2 + q^2 + q^3 = 1 + 2q + 2q^2 + q^3 = 1(1+q)(1+q+q^2). \]

Note that $G_S(1) = 6 = 3! = |S|$. Similarly, if T is the set of all permutations of {1,2,3,4},
a longer calculation leads to

\[ G_{T,\mathrm{inv}}(q) = 1 + 3q + 5q^2 + 6q^3 + 5q^4 + 3q^5 + q^6 = 1(1+q)(1+q+q^2)(1+q+q^2+q^3). \]

The factorization patterns in this example are explained and generalized in §8.4.


8.16. Example. Let $S = R(0^2 1^3)$ be the set of all rearrangements of two 0's and three 1's.
We know that

S = {00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100}.

Counting inversions, we conclude that

\[ G_{S,\mathrm{inv}}(q) = q^0 + q^1 + q^2 + q^3 + q^2 + q^3 + q^4 + q^4 + q^5 + q^6 = 1 + q + 2q^2 + 2q^3 + 2q^4 + q^5 + q^6. \]


8.17. Example. Let $S = R(a^1 b^1 c^2)$, where we use a < b < c as the ordering of the
alphabet. We know that

S = {abcc, acbc, accb, bacc, bcac, bcca, cabc, cacb, cbac, cbca, ccab, ccba}.

Counting inversions leads to

\[ G_{S,\mathrm{inv}}(q) = 1 + 2q + 3q^2 + 3q^3 + 2q^4 + q^5. \]

Now let $T = R(a^1 b^2 c^1)$ and $U = R(a^2 b^1 c^1)$ with the same ordering of the alphabet. One
may check that $G_{S,\mathrm{inv}}(q) = G_{T,\mathrm{inv}}(q) = G_{U,\mathrm{inv}}(q)$, although the sets of words in question
are all different. This phenomenon will be explained in §8.9.
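The claim that S, T, and U share one inversion generating function can be confirmed by brute force. The following sketch (helper names are hypothetical) enumerates the distinct rearrangements of a multiset of letters and tallies inversions; it relies on Python's default character ordering, which agrees with a < b < c and 0 < 1.

```python
from collections import Counter
from itertools import permutations

def inv(word):
    """Count pairs i < j with word[i] > word[j]."""
    n = len(word)
    return sum(1 for i in range(n) for j in range(i + 1, n) if word[i] > word[j])

def inv_generating_function(letters):
    """Coefficient list of the sum of q^inv(w) over distinct rearrangements w."""
    words = set(permutations(letters))  # the set removes repeated rearrangements
    counts = Counter(inv(w) for w in words)
    return [counts[i] for i in range(max(counts) + 1)]

print(inv_generating_function("00111"))  # [1, 1, 2, 2, 2, 1, 1]
print(inv_generating_function("abcc"))   # [1, 2, 3, 3, 2, 1]
print(inv_generating_function("abbc"))   # [1, 2, 3, 3, 2, 1]
print(inv_generating_function("aabc"))   # [1, 2, 3, 3, 2, 1]
```

Enumerating all permutations is only practical for short words, but it suffices to reproduce Examples 8.16 and 8.17.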


8.18. Remark. It can be shown that for any word w, inv(w) is the minimum number of
transpositions of adjacent letters required to sort the letters of w into weakly increasing
order (Theorem 7.29 proves a special case of this result).


8.4 q-Factorials and Inversions


This section studies the generating functions for sets of permutations weighted by inversions.
The answer turns out to be the following q-analogue of n!.


8.19. Definition: q-Factorials. Given $n \in \mathbb{Z}_{>0}$ and a formal variable q, define the
q-analogue of n! to be the polynomial

\[ [n]!_q = \prod_{i=1}^{n} [i]_q = \prod_{i=1}^{n} (1 + q + q^2 + \cdots + q^{i-1}) = \prod_{i=1}^{n} \frac{q^i - 1}{q - 1}. \]

Also define $[0]!_q = 1$.
The q-factorials satisfy the recursion $[n]!_q = [n-1]!_q \, [n]_q$ for all $n \geq 1$.


8.20. Example. We have $[0]!_q = 1$, $[1]!_q = 1$, $[2]!_q = [2]_q = 1 + q$,

\begin{align*}
[3]!_q &= (1+q)(1+q+q^2) = 1 + 2q + 2q^2 + q^3, \\
[4]!_q &= (1+q)(1+q+q^2)(1+q+q^2+q^3) = 1 + 3q + 5q^2 + 6q^3 + 5q^4 + 3q^5 + q^6, \\
[5]!_q &= 1 + 4q + 9q^2 + 15q^3 + 20q^4 + 22q^5 + 20q^6 + 15q^7 + 9q^8 + 4q^9 + q^{10}.
\end{align*}

We can use other variables besides q; for instance, $[3]!_t = 1 + 2t + 2t^2 + t^3$. Occasionally we
replace the variable by a specific integer or real number; then the q-factorial evaluates to
some specific number. For example, when q = 4, $[3]!_q = 1 + 8 + 32 + 64 = 105$. As another
example, when t = 1, $[n]!_t = n!$.
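The expansions in Example 8.20 can be reproduced by multiplying out the q-integers. In the sketch below, polynomials are represented (by assumption) as coefficient lists, with index i holding the coefficient of $q^i$; the helper names are illustrative.

```python
def poly_mul(p, r):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(r) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(r):
            out[i + j] += x * y
    return out

def q_int(n):
    """[n]_q = 1 + q + ... + q^(n-1); by convention [0]_q = 0."""
    return [1] * n if n > 0 else [0]

def q_factorial(n):
    """[n]!_q as the product of [1]_q, [2]_q, ..., [n]_q; [0]!_q = 1."""
    result = [1]
    for i in range(1, n + 1):
        result = poly_mul(result, q_int(i))
    return result

def evaluate(poly, value):
    """Substitute a number for the variable q."""
    return sum(c * value**i for i, c in enumerate(poly))

print(q_factorial(5))               # [1, 4, 9, 15, 20, 22, 20, 15, 9, 4, 1]
print(evaluate(q_factorial(3), 4))  # 105
print(evaluate(q_factorial(5), 1))  # 120
```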


Now we prove that [n]!, is the generating function for permutations of n symbols 
weighted by inversions. 


8.21. Theorem: q-Factorials and Inversions. For every $n \in \mathbb{Z}_{\geq 0}$, let $S_n$ be the set of
all permutations of {1,2,...,n}, weighted by inversions. Then

\[ G_{S_n,\mathrm{inv}}(q) = \sum_{w \in S_n} q^{\mathrm{inv}(w)} = [n]!_q. \tag{8.1} \]

Proof. We use induction on n. When n = 0 or n = 1, both sides of (8.1) are 1. Now,
fix $n \geq 2$, and assume we already know that $\sum_{w' \in S_{n-1}} q^{\mathrm{inv}(w')} = [n-1]!_q$. To prove the
corresponding result for n, we define a weight-preserving bijection $F : S_n \to [n] \times S_{n-1}$,
where the weight of $(k, w') \in [n] \times S_{n-1}$ is k + inv(w'). By Example 8.4, the induction
hypothesis, and the Product Rule for Weighted Sets, the codomain of F has generating
function $[n]_q \cdot [n-1]!_q = [n]!_q$. So the needed result will follow from the Bijection Rule for
Weighted Sets.

Given a permutation $w = w_1 w_2 \cdots w_n \in S_n$, we need to define $F(w) = (k, w') \in
[n] \times S_{n-1}$ in such a way that inv(w) = k + inv(w'). One way to pass from $w \in S_n$ to
$w' \in S_{n-1}$ is to erase the unique occurrence of the symbol n in the word w. To preserve
weights, we are forced to define k = inv(w) − inv(w'). To see that the map sending w to
(k, w') is invertible, we describe k in another way. Suppose $w_i = n$. Since n is larger than
every other symbol in w, we see that (i, j) is an inversion pair of w for $i < j \leq n$, whereas
(r, i) is not an inversion pair of w for $1 \leq r < i$. All other inversion pairs of w do not involve
the symbol n; these inversion pairs correspond bijectively to the inversion pairs of w'. Thus,
inv(w) = inv(w') + n − i, where k = n − i is the number of symbols to the right of n in
w. It follows that the two-sided inverse of F sends $(k, w') \in [n] \times S_{n-1}$ to the permutation
obtained from the word w' by inserting the symbol n so that n is followed by k symbols.
The formula k = n − i also shows that $k \in \{0, 1, 2, \ldots, n-1\} = [n]$, so that F does map
into the required codomain. □


8.22. Example. For n = 6, we compute F(351642) = (2, 35142). Observe that

Inv(351642) = {(1,3), (1,6), (2,3), (2,5), (2,6), (4,5), (4,6), (5,6)},
Inv(35142) = {(1,3), (1,5), (2,3), (2,4), (2,5), (4,5)},

and inv(351642) = 8 = 2 + 6 = 2 + inv(35142). To pass from Inv(351642) to Inv(35142),
we delete the two inversion pairs (4,5) and (4,6) involving the symbol $w_4 = 6 = n$, and
then renumber positions in the remaining inversion pairs to account for the erasure of the
symbol in position 4.

Next we compute $F^{-1}(k, 25143)$ for k = 0, 1, 2, 3, 4, 5. The answers are

251436, 251463, 251643, 256143, 265143, 625143.

The inversion counts for these permutations are 5, 6, 7, 8, 9, and 10, respectively. As the
new largest symbol moves through the permutation w' from right to left, one symbol at a
time, the total inversion count increases by 1 with each step.


We can iterate the construction in the proof of Theorem 8.21, removing n − 1 from w'
to go from the set $[n] \times S_{n-1}$ to $[n] \times [n-1] \times S_{n-2}$, and so on. Ultimately, we arrive at
a weight-preserving bijection $G : S_n \to [n] \times [n-1] \times \cdots \times [2] \times [1]$ that maps $w \in S_n$ to
an n-tuple $G(w) = (k_n, k_{n-1}, \ldots, k_1) \in [n] \times [n-1] \times \cdots \times [1]$. Here, $k_n$ is the number of
symbols to the right of n in w; $k_{n-1}$ is the number of symbols to the right of n − 1 in w
with n erased; $k_{n-2}$ is the number of symbols to the right of n − 2 in w with n and n − 1
erased; and so on. More succinctly, for $1 \leq r \leq n$, $k_r$ is the number of symbols to the right
of r in w that are less than r. In other words,

\[ k_r = |\{(i,j) : 1 \leq i < j \leq n \text{ and } r = w_i > w_j\}| \quad \text{for } 1 \leq r \leq n. \]

Thus, $k_r$ counts the number of inversion pairs in Inv(w) such that the left symbol in the
pair is r. The sequence $G(w) = (k_n, k_{n-1}, \ldots, k_1)$ is called an inversion table for w. We
can reconstruct the permutation w from its inversion table by repeatedly applying the map
$F^{-1}$ from the proof of Theorem 8.21. Starting with the empty word, we insert symbols
r = 1, 2, ..., n in this order. We insert symbol r to the left of $k_r$ symbols in the current
word.


8.23. Example. The inversion table for $w = 42851673 \in S_8$ is G(w) = (5,1,1,2,3,0,1,0).
We have $k_5 = 2$, for instance, because of the two inversion pairs (4,5) and (4,8) caused by
the symbols 5 > 1 and 5 > 3 in w. Now let us compute $G^{-1}(5,5,1,2,3,1,0,0)$. We begin
with the empty word, insert 1, then insert 2 at the right end to obtain 12, since $k_2 = k_1 = 0$.
Since $k_3 = 1$, we insert 3 to the left of the 2 to get 132. Since $k_4 = 3$, we insert 4 to the left
of all 3 existing symbols to get 4132. The process continues, leading to 41532, then 415362,
then 4715362, and finally to 47815362. Having found this answer, one may quickly check
that G(47815362) = (5,5,1,2,3,1,0,0), as needed.
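Both directions of the inversion-table correspondence are short to code. The sketch below is one possible rendering, assuming permutations are stored as Python lists of integers and tables are listed in the order $(k_n, \ldots, k_1)$ used above; the function names are hypothetical.

```python
def inversion_table(w):
    """Return (k_n, ..., k_1), where k_r counts the symbols to the right of r
    in w that are less than r."""
    n = len(w)
    table = []
    for r in range(n, 0, -1):
        pos = w.index(r)
        table.append(sum(1 for x in w[pos + 1:] if x < r))
    return tuple(table)

def from_inversion_table(table):
    """Rebuild w by inserting r = 1, 2, ..., n to the left of k_r symbols."""
    n = len(table)
    word = []
    for r in range(1, n + 1):
        k_r = table[n - r]               # the table is listed as (k_n, ..., k_1)
        word.insert(len(word) - k_r, r)  # leaves exactly k_r symbols to the right
    return word

print(inversion_table([4, 2, 8, 5, 1, 6, 7, 3]))         # (5, 1, 1, 2, 3, 0, 1, 0)
print(from_inversion_table((5, 5, 1, 2, 3, 1, 0, 0)))    # [4, 7, 8, 1, 5, 3, 6, 2]
```

The sum of the entries of the table equals inv(w), which is the weight-preservation property of G.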


Other types of inversion tables for permutations can be constructed by classifying the
inversions of w in different ways. Our discussion above classified inversions by the value of
the leftmost symbol in the inversion pair. One can also classify inversions using the value of
the rightmost symbol, the position of the leftmost symbol, or the position of the rightmost
symbol. These possibilities are explored in the exercises.


8.5 Descents and Major Index


This section introduces more statistics on words, which leads to another combinatorial
interpretation for the q-factorial $[n]!_q$.


8.24. Definition: Descents and Major Index. Let $w = w_1 w_2 \cdots w_n$ be a word where
each $w_i$ comes from some totally ordered alphabet. The descent set of w, denoted Des(w),
is the set of all i < n such that $w_i > w_{i+1}$. The descent statistic of w is des(w) = |Des(w)|.
The major index of w, denoted maj(w), is the sum of the elements of the set Des(w).


Thus, Des(w) is the set of positions in w where a letter is immediately followed by
a smaller letter; des(w) is the number of such positions; and maj(w) is the sum of these
positions.


8.25. Example. If w = 47815362, then Des(w) = {3,5,7}, des(w) = 3, and maj(w) =
3 + 5 + 7 = 15. If w = 101100101, then Des(w) = {1,4,7}, des(w) = 3, and maj(w) = 12.
If w = 33555789, then Des(w) = ∅, des(w) = 0, and maj(w) = 0.
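These three statistics are easy to compute mechanically. A minimal sketch follows, with hypothetical function names; words are given as strings of single digits, so character comparison matches numerical order.

```python
def descent_set(w):
    """Des(w): 1-based positions i < n with w_i > w_{i+1}."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}

def des(w):
    """Number of descents."""
    return len(descent_set(w))

def maj(w):
    """Major index: the sum of the descent positions."""
    return sum(descent_set(w))

print(sorted(descent_set("47815362")), des("47815362"), maj("47815362"))  # [3, 5, 7] 3 15
print(sorted(descent_set("101100101")), maj("101100101"))                  # [1, 4, 7] 12
print(descent_set("33555789"), maj("33555789"))                            # set() 0
```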


8.26. Theorem: q-Factorials and Major Index. For all $n \in \mathbb{Z}_{\geq 0}$, let $S_n$ be the set of
all permutations of {1,2,...,n}, weighted by major index. Then

\[ G_{S_n,\mathrm{maj}}(q) = \sum_{w \in S_n} q^{\mathrm{maj}(w)} = [n]!_q. \tag{8.2} \]

Proof. We imitate the proof of Theorem 8.21. Both sides of (8.2) are 1 when n = 0 or
n = 1. Proceeding by induction, fix $n \geq 2$, and assume $\sum_{w' \in S_{n-1}} q^{\mathrm{maj}(w')} = [n-1]!_q$.
To prove the corresponding result for n, it suffices to define a weight-preserving bijection
$F : S_n \to [n] \times S_{n-1}$, since $[n]_q \cdot [n-1]!_q = [n]!_q$. If F(w) = (k, w'), we must have
maj(w) = k + maj(w') to preserve weights.

As before, we can map $w \in S_n$ to $w' \in S_{n-1}$ by erasing the unique occurrence of n in
the word $w = w_1 w_2 \cdots w_n$. To preserve weights, we are forced to define F(w) = (maj(w) −
maj(w'), w'). However, it is not obvious that the map F defined in this way is invertible. To
check this, we need to know that w can be recovered from w' and k = maj(w) − maj(w').
Given any $w' \in S_{n-1}$, there are n positions between the n − 1 symbols of w' (including the
far left and far right positions) where we might insert n to get a permutation in $S_n$. Imagine
inserting n into each of these n positions and seeing how maj(w') changes. If we can show
that the changes in major index are the numbers 0, 1, 2, ..., n − 1 (in some order), then
the invertibility of F will follow. This argument also proves that k = maj(w) − maj(w') ∈
{0, 1, ..., n − 1} = [n], so that F does map into the required codomain. We prove the
required fact in Lemma 8.28 below, after considering an example. □


8.27. Example. Let n = 8 and $w' = 4251673 \in S_7$, so that maj(w') = 1 + 3 + 6 = 10.
There are eight gaps in w' where the new symbol 8 might be placed. We compute the major
index of each of the resulting permutations $w \in S_8$:

maj(8 > 4 > 2 < 5 > 1 < 6 < 7 > 3) = 1 + 2 + 4 + 7     = 14 = maj(w') + 4
maj(4 < 8 > 2 < 5 > 1 < 6 < 7 > 3) = 2 + 4 + 7         = 13 = maj(w') + 3
maj(4 > 2 < 8 > 5 > 1 < 6 < 7 > 3) = 1 + 3 + 4 + 7     = 15 = maj(w') + 5
maj(4 > 2 < 5 < 8 > 1 < 6 < 7 > 3) = 1 + 4 + 7         = 12 = maj(w') + 2
maj(4 > 2 < 5 > 1 < 8 > 6 < 7 > 3) = 1 + 3 + 5 + 7     = 16 = maj(w') + 6
maj(4 > 2 < 5 > 1 < 6 < 8 > 7 > 3) = 1 + 3 + 6 + 7     = 17 = maj(w') + 7
maj(4 > 2 < 5 > 1 < 6 < 7 < 8 > 3) = 1 + 3 + 7         = 11 = maj(w') + 1
maj(4 > 2 < 5 > 1 < 6 < 7 > 3 < 8) = 1 + 3 + 6         = 10 = maj(w') + 0

We see that the possible values of k = maj(w) − maj(w') are 4, 3, 5, 2, 6, 7, 1, 0, which form
a rearrangement of 0, 1, 2, 3, 4, 5, 6, 7.


The next lemma explains precisely how the major index changes when we insert the
symbol n into a permutation of {1, 2, ..., n − 1}.

8.28. Lemma. Suppose $v = v_1 v_2 \cdots v_{n-1} \in S_{n-1}$ has descents at positions
$i_1 > i_2 > \cdots > i_d$. Number the n gaps between the n − 1 symbols in v as follows. The gap to the right
of $v_{n-1}$ is numbered 0. For $1 \leq j \leq d$, the gap between $v_{i_j}$ and $v_{i_j+1}$ is numbered j. The
remaining gaps are numbered d + 1, d + 2, ..., n − 1, starting at the gap left of $v_1$ and working
left to right. For all j in the range $0 \leq j < n$, if w is obtained from v by inserting n in the
gap numbered j, then maj(w) = maj(v) + j.


Proof. As an example of how gaps are numbered, consider v = 4251673 from the example
above. The gaps in v are numbered as follows:

  _ 4 > 2 < 5 > 1 < 6 < 7 > 3 _
  4   3   5   2   6   7   1   0

The calculations in the example show that inserting the symbol 8 into the gap numbered j
causes the major index to increase by j, as predicted by the lemma.

Now we analyze the general case. If we insert n into the far right gap of v (which is
numbered 0), there are no new descents, so maj(w) = maj(v) + 0 as needed. Next suppose
we insert n into the gap numbered j, where $1 \leq j \leq d$. We had $v_{i_j} > v_{i_j+1}$ in v, but the
insertion of n changes this configuration to $w_{i_j} < w_{i_j+1} = n > w_{i_j+2}$. This pushes the
descent in v at position $i_j$ one position to the right. Furthermore, the descents that occur
in v at positions $i_{j-1}, \ldots, i_1$ (which are to the right of $i_j$) also get pushed one position to
the right in w because of the new symbol n. It follows that the major index increases by
exactly j, as needed.

Finally, suppose $d < j \leq n - 1$. Let the gap numbered j occur at position u in w, and let
t be the number of descents in v preceding this gap. By definition of the gap labeling, we
must have j = (u − t) + d. On the other hand, inserting n in this gap produces a new descent
in w at position u, and pushes the (d − t) descents in v located to the right of position u
one position further right. The net change in the major index is therefore u + (d − t) = j,
as needed. □
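The gap numbering of Lemma 8.28 can be written out and tested against direct computation of the major index. In this sketch (names are illustrative), `labels[p]` is the number assigned to the gap that has exactly p symbols of v to its left, so `labels[0]` is the far left gap and `labels[len(v)]` is the far right gap.

```python
def maj(w):
    """Major index: sum of 1-based positions i with w_i > w_{i+1}."""
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])

def gap_labels(v):
    """Number the gaps of v as in Lemma 8.28."""
    m = len(v)
    labels = [None] * (m + 1)
    labels[m] = 0                                        # far right gap
    descents = [i for i in range(1, m) if v[i - 1] > v[i]]  # 1-based positions
    for j, i in enumerate(sorted(descents, reverse=True), start=1):
        labels[i] = j                                    # gap just after position i_j
    next_label = len(descents) + 1
    for p in range(m):                                   # remaining gaps, left to right
        if labels[p] is None:
            labels[p] = next_label
            next_label += 1
    return labels

v = [4, 2, 5, 1, 6, 7, 3]
labels = gap_labels(v)
print(labels)  # [4, 3, 5, 2, 6, 7, 1, 0]

# Inserting n = 8 into the gap with p symbols to its left raises maj by labels[p].
changes = [maj(v[:p] + [8] + v[p:]) - maj(v) for p in range(len(v) + 1)]
print(changes == labels)  # True
```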


As in the case of inversions, we can iterate the proof of Theorem 8.26 to obtain a
weight-preserving bijection $G : (S_n, \mathrm{maj}) \to [n] \times [n-1] \times \cdots \times [1]$. For $w \in S_n$, define
$w^{(r)}$ to be w with all symbols greater than r erased. We have $G(w) = (k_n, k_{n-1}, \ldots, k_1)$,
where $k_r = \mathrm{maj}(w^{(r)}) - \mathrm{maj}(w^{(r-1)})$ for $1 \leq r \leq n$. The list $(k_n, \ldots, k_1)$ is a major index
table for w. We can recover w from its major index table by inserting symbols 1, 2, ..., n,
in this order, into an initially empty word. We insert symbol r into the unique position
that increases the major index of the current word by $k_r$. This position can be found by
numbering gaps as described in Lemma 8.28.
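The major index table and its inverse can be sketched as follows. This version deliberately swaps in a brute-force search over all gaps in place of the gap numbering of Lemma 8.28, which is simpler to read but slower; the function names are hypothetical, and permutations are stored as lists of integers.

```python
def maj(w):
    """Major index: sum of 1-based positions i with w_i > w_{i+1}."""
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])

def major_index_table(w):
    """Return (k_n, ..., k_1) with k_r = maj(w^(r)) - maj(w^(r-1))."""
    n = len(w)
    table = []
    for r in range(n, 0, -1):
        upto_r = [x for x in w if x <= r]   # w^(r): symbols greater than r erased
        below_r = [x for x in w if x < r]   # w^(r-1)
        table.append(maj(upto_r) - maj(below_r))
    return tuple(table)

def from_major_index_table(table):
    """Insert r = 1, ..., n into the unique gap that raises maj by k_r."""
    n = len(table)
    word = []
    for r in range(1, n + 1):
        k_r = table[n - r]                  # the table is listed as (k_n, ..., k_1)
        for p in range(len(word) + 1):      # brute-force search over all gaps
            candidate = word[:p] + [r] + word[p:]
            if maj(candidate) - maj(word) == k_r:
                word = candidate
                break
    return word

print(major_index_table([4, 2, 8, 5, 1, 6, 7, 3]))  # (5, 1, 5, 1, 2, 0, 1, 0)
print(from_major_index_table((3, 4, 0, 1, 1, 0)))   # [6, 2, 3, 1, 5, 4]
```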


8.29. Example. Given $w = 42851673 \in S_8$, we compute maj(w) = 15, maj(4251673) = 10,
maj(425163) = 9, maj(42513) = 4, maj(4213) = 3, maj(213) = 1, maj(21) = 1, maj(1) =
0, and maj($\epsilon$) = 0, where $\epsilon$ denotes the empty word. So the major index table of w is
(5, 1, 5, 1, 2, 0, 1, 0).

Next, taking n = 6, we find $G^{-1}(3,4,0,1,1,0)$. Using the insertion procedure from
Lemma 8.28, we generate the following sequence of permutations:

1, 21, 231, 2314, 23154, 623154.


8.6 q-Binomial Coefficients


The formula $[n]!_q = \prod_{i=1}^{n} [i]_q$ for the q-factorial is analogous to the formula $n! = \prod_{i=1}^{n} i$ for
the ordinary factorial. We can extend this analogy to binomial coefficients and multinomial
coefficients. This leads to the following definitions.


8.30. Definition: q-Binomial Coefficients. Given $k, n \in \mathbb{Z}_{\geq 0}$ with $0 \leq k \leq n$ and a
variable q, define the q-binomial coefficient

\[ \begin{bmatrix} n \\ k \end{bmatrix}_q = \frac{[n]!_q}{[k]!_q \, [n-k]!_q} = \frac{(q^n - 1)(q^{n-1} - 1) \cdots (q - 1)}{(q^k - 1)(q^{k-1} - 1) \cdots (q - 1) \, (q^{n-k} - 1)(q^{n-k-1} - 1) \cdots (q - 1)}. \]


8.31. Definition: q-Multinomial Coefficients. Given $n_1, \ldots, n_k \in \mathbb{Z}_{\geq 0}$ and a variable
q, define the q-multinomial coefficient

\[ \begin{bmatrix} n_1 + \cdots + n_k \\ n_1, \ldots, n_k \end{bmatrix}_q = \frac{[n_1 + \cdots + n_k]!_q}{[n_1]!_q \, [n_2]!_q \cdots [n_k]!_q} = \frac{(q^{n_1 + \cdots + n_k} - 1) \cdots (q^2 - 1)(q - 1)}{\prod_{i=1}^{k} \left[ (q^{n_i} - 1)(q^{n_i - 1} - 1) \cdots (q - 1) \right]}. \]


It is not immediately evident from the defining formulas that q-binomial coefficients and
q-multinomial coefficients really are polynomials in q (rather than ratios of polynomials).
Below, we prove the stronger fact that these objects are polynomials with nonnegative integer
coefficients. We also give several combinatorial interpretations for q-binomial coefficients
and q-multinomial coefficients as generating functions for weighted sets. Before doing so,
we need to develop a few more tools.

By consulting the definitions, we see at once that
$\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n \\ n-k \end{bmatrix}_q = \begin{bmatrix} n \\ k, n-k \end{bmatrix}_q$. In particular,
q-binomial coefficients are special cases of q-multinomial coefficients. We often prefer to
use multinomial coefficients, writing $\begin{bmatrix} a+b \\ a,b \end{bmatrix}_q$ rather than $\begin{bmatrix} a+b \\ a \end{bmatrix}_q$ or $\begin{bmatrix} a+b \\ b \end{bmatrix}_q$, because in most
combinatorial settings the parameters a and b are more natural than a and a + b.

Before describing the combinatorics of q-binomial coefficients, we give an algebraic proof
of two recursions satisfied by the q-binomial coefficients, which are q-analogues of Pascal's
Recursion 2.3 for ordinary binomial coefficients.


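8.32. Theorem: Recursions for q-Binomial Coefficients. For all $a, b \in \mathbb{Z}_{>0}$,

\begin{align*}
\begin{bmatrix} a+b \\ a,b \end{bmatrix}_q &= q^b \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_q + \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_q; \\
\begin{bmatrix} a+b \\ a,b \end{bmatrix}_q &= \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_q + q^a \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_q.
\end{align*}

The initial conditions are $\begin{bmatrix} a \\ a,0 \end{bmatrix}_q = \begin{bmatrix} b \\ 0,b \end{bmatrix}_q = 1$ for all $a, b \in \mathbb{Z}_{\geq 0}$.


Proof. We prove the first equality, leaving the second one as an exercise. Writing out the
definitions, the right side of the first recursion is

\[ q^b \begin{bmatrix} a+b-1 \\ a-1,b \end{bmatrix}_q + \begin{bmatrix} a+b-1 \\ a,b-1 \end{bmatrix}_q = \frac{q^b \, [a+b-1]!_q}{[a-1]!_q \, [b]!_q} + \frac{[a+b-1]!_q}{[a]!_q \, [b-1]!_q}. \]

Multiply the first fraction by $[a]_q / [a]_q$ and the second fraction by $[b]_q / [b]_q$ to create a common
denominator $[a]!_q [b]!_q$. Bringing out common factors, we obtain

\[ \frac{[a+b-1]!_q}{[a]!_q \, [b]!_q} \cdot \left( q^b [a]_q + [b]_q \right). \]

By definition of q-integers,

\begin{align*}
[b]_q + q^b [a]_q &= (1 + q + q^2 + \cdots + q^{b-1}) + q^b (1 + q + \cdots + q^{a-1}) \\
&= (1 + q + \cdots + q^{b-1}) + (q^b + q^{b+1} + \cdots + q^{a+b-1}) = [a+b]_q.
\end{align*}

Putting this into the previous formula, we get

\[ \frac{[a+b-1]!_q \, [a+b]_q}{[a]!_q \, [b]!_q} = \frac{[a+b]!_q}{[a]!_q \, [b]!_q} = \begin{bmatrix} a+b \\ a,b \end{bmatrix}_q. \]

The initial conditions follow immediately from the definitions. □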
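The first recursion, together with the initial conditions, gives a division-free way to compute q-binomial coefficients. The sketch below is illustrative (the names `q_binomial`, `shift`, and `poly_add` are hypothetical); polynomials are stored as tuples of coefficients, with index i holding the coefficient of $q^i$, so that results can be cached.

```python
from functools import lru_cache

def poly_add(p, r):
    """Add two polynomials given as coefficient tuples."""
    n = max(len(p), len(r))
    return tuple((p[i] if i < len(p) else 0) + (r[i] if i < len(r) else 0)
                 for i in range(n))

def shift(p, k):
    """Multiply the polynomial p by q^k."""
    return (0,) * k + tuple(p)

@lru_cache(maxsize=None)
def q_binomial(a, b):
    """Coefficients of the q-binomial coefficient with parameters a, b,
    computed by the first recursion of Theorem 8.32."""
    if a == 0 or b == 0:
        return (1,)                                   # initial conditions
    return poly_add(shift(q_binomial(a - 1, b), b),   # q^b * [a+b-1; a-1, b]
                    q_binomial(a, b - 1))             # + [a+b-1; a, b-1]

print(q_binomial(2, 3))  # (1, 1, 2, 2, 2, 1, 1)

# The second recursion yields the same polynomial.
second = poly_add(q_binomial(1, 3), shift(q_binomial(2, 2), 2))
print(second == q_binomial(2, 3))  # True
```

Because only additions and shifts occur, every coefficient produced this way is visibly a nonnegative integer.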


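8.33. Corollary: Polynomiality of q-Binomial Coefficients. For all $k, n \in \mathbb{Z}_{\geq 0}$ with
$0 \leq k \leq n$, $\begin{bmatrix} n \\ k \end{bmatrix}_q$ is a polynomial in q with nonnegative integer coefficients.

Proof. Use induction on $n \geq 0$. The base case (n = 0) holds because $\begin{bmatrix} 0 \\ 0 \end{bmatrix}_q = 1$. Fix n > 0,
and assume that $\begin{bmatrix} n-1 \\ j \end{bmatrix}_q$ is already known to be a polynomial with coefficients in $\mathbb{Z}_{\geq 0}$ for all
j in the range $0 \leq j \leq n - 1$. If k = 0 or k = n, then $\begin{bmatrix} n \\ k \end{bmatrix}_q = 1$ by the initial conditions.
Otherwise, by the first recursion above (with a replaced by k and
b replaced by n − k),

\[ \begin{bmatrix} n \\ k \end{bmatrix}_q = q^{n-k} \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q + \begin{bmatrix} n-1 \\ k \end{bmatrix}_q. \]

By the induction hypothesis, each q-binomial coefficient on the right side is a polynomial
with coefficients in $\mathbb{Z}_{\geq 0}$, so $\begin{bmatrix} n \\ k \end{bmatrix}_q$ is also such a polynomial. □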


8.7 Combinatorial Interpretations of q-Binomial Coefficients


We now show that q-binomial coefficients count various weighted sets of anagrams, lattice
paths, and integer partitions.


8.34. Theorem: Combinatorial Interpretations of q-Binomial Coefficients. Fix
integers $a, b \geq 0$. Let $R(0^a 1^b)$ be the set of anagrams consisting of a zeroes and b ones. Let
L(a,b) be the set of all lattice paths from (0,0) to (a,b). For $\pi \in L(a,b)$, let area($\pi$) be
the area of the region between $\pi$ and the x-axis, and let area'($\pi$) be the area of the region
between $\pi$ and the y-axis. Let P(a,b) be the set of integer partitions $\mu$ with largest part at
most a and with at most b parts. Then

\[ \begin{bmatrix} a+b \\ a,b \end{bmatrix}_q = \sum_{w \in R(0^a 1^b)} q^{\mathrm{inv}(w)} = \sum_{\pi \in L(a,b)} q^{\mathrm{area}(\pi)} = \sum_{\pi \in L(a,b)} q^{\mathrm{area}'(\pi)} = \sum_{\mu \in P(a,b)} q^{|\mu|}. \]


Proof. Step 1. For $a, b \in \mathbb{Z}_{\geq 0}$, define $g(a,b) = \sum_{\pi \in L(a,b)} q^{\mathrm{area}(\pi)} = G_{L(a,b),\mathrm{area}}(q)$. We
show that this function satisfies the same recursion and initial conditions as the q-binomial
coefficients, namely

\[ g(a,b) = q^b g(a-1,b) + g(a,b-1) \text{ for } a, b > 0; \qquad g(a,0) = g(0,b) = 1 \text{ for } a, b \geq 0. \]

It then follows by a routine induction on a + b that $\begin{bmatrix} a+b \\ a,b \end{bmatrix}_q = g(a,b)$ for all $a, b \in \mathbb{Z}_{\geq 0}$.

To check the initial conditions, note that there is only one lattice path from (0,0) to
(a,0), and the area underneath this path is zero. So $g(a,0) = q^0 = 1$. Similarly, g(0,b) = 1.

Now we prove the recursion for g(a,b), for fixed a, b > 0. The set L(a,b) is the disjoint
union of sets $L_1$ and $L_2$, where $L_1$ consists of all paths from (0,0) to (a,b) ending in an
east step, and $L_2$ consists of all paths from (0,0) to (a,b) ending in a north step. See
Figure 8.1. Deleting the final north step from a path in $L_2$ defines a bijection from $L_2$
to L(a,b − 1), which is weight-preserving since the area below the path is not affected by
the deletion of the north step. By the Bijection Rule for Weighted Sets,
$\sum_{\pi \in L_2} q^{\mathrm{area}(\pi)} = \sum_{\pi \in L(a,b-1)} q^{\mathrm{area}(\pi)} = g(a,b-1)$. On the other hand, deleting the final east step from a path
in $L_1$ defines a bijection from $L_1$ to L(a − 1,b) that is not weight-preserving. The reason is
that the b area cells below the final east step in a path in $L_1$ no longer contribute to the area
of the path in L(a − 1,b). However, since the area drops by b for all objects in $L_1$, we conclude
from the Weight-Shifting Rule that $\sum_{\pi \in L_1} q^{\mathrm{area}(\pi)} = q^b \sum_{\pi \in L(a-1,b)} q^{\mathrm{area}(\pi)} = q^b g(a-1,b)$.
By the Sum Rule for Weighted Sets,

\[ g(a,b) = G_{L(a,b)}(q) = G_{L_1}(q) + G_{L_2}(q) = q^b g(a-1,b) + g(a,b-1). \]

We remark that a similar argument involving deletion of the initial step of a path in L(a,b)
establishes the dual recursion $g(a,b) = g(a-1,b) + q^a g(a,b-1)$.


FIGURE 8.1
Deleting the final step of a lattice path.


Step 2. We define a weight-preserving bijection $g : R(0^a 1^b) \to L(a,b)$, where $R(0^a 1^b)$
is weighted by inv and L(a,b) is weighted by area. For $w \in R(0^a 1^b)$, g(w) is the path
obtained by converting each 0 in w to an east step and each 1 in w to a north step. By
examining a picture, one sees that inv(w) = area(g(w)) for all $w \in R(0^a 1^b)$. For example,
given w = 1001010, g(w) is the lattice path shown in Figure 8.2. The four area cells in the
lowest row correspond to the inversions between the first symbol 1 in w and the four zeroes
occurring later. The two area cells in the next lowest row come from the inversions between
the second 1 in w and the two zeroes occurring later. And so on. By the Bijection Rule for
Weighted Sets, $G_{R(0^a 1^b),\mathrm{inv}}(q) = G_{L(a,b),\mathrm{area}}(q) = \begin{bmatrix} a+b \\ a,b \end{bmatrix}_q$.

FIGURE 8.2
A weight-preserving bijection from words to lattice paths.


Step 3. We define a weight-preserving bijection $I : L(a,b) \to L(a,b)$, where the domain
is weighted by area and the codomain is weighted by area'. Given $\pi \in L(a,b)$, $I(\pi)$ is the
path obtained by reversing the sequence of north and east steps in $\pi$. Geometrically, $I(\pi)$
is obtained by rotating the path 180 degrees through (a/2, b/2). One readily checks from a
picture that area'($I(\pi)$) = area($\pi$). So $G_{L(a,b),\mathrm{area}'}(q) = G_{L(a,b),\mathrm{area}}(q) = \begin{bmatrix} a+b \\ a,b \end{bmatrix}_q$.

Step 4. Inspection of Figure 2.17 shows that the set P(a,b), weighted by wt($\mu$) = $|\mu|$ for
$\mu \in P(a,b)$, can be identified with the set L(a,b), weighted by area'. So
$\sum_{\mu \in P(a,b)} q^{|\mu|} = G_{L(a,b),\mathrm{area}'}(q) = \begin{bmatrix} a+b \\ a,b \end{bmatrix}_q$. □
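The four interpretations in Theorem 8.34 can be compared by brute force for small a and b. The sketch below makes some encoding assumptions that are not in the text: a binary word is a tuple of 0s and 1s, the same tuple doubles as a lattice path (0 = east step, 1 = north step), and a partition is a weakly decreasing tuple of positive integers. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def binary_words(a, b):
    """All words with a zeroes and b ones, as tuples."""
    words = []
    for zero_positions in combinations(range(a + b), a):
        zs = set(zero_positions)
        words.append(tuple(0 if i in zs else 1 for i in range(a + b)))
    return words

def inv(w):
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def area(w):
    """Area between the path and the x-axis."""
    height = total = 0
    for step in w:
        if step == 1:
            height += 1
        else:
            total += height
    return total

def area_prime(w):
    """Area between the path and the y-axis."""
    x = total = 0
    for step in w:
        if step == 0:
            x += 1
        else:
            total += x
    return total

def partitions_in_box(a, b):
    """Partitions with largest part at most a and at most b parts."""
    def build(max_part, parts_left):
        yield ()
        if parts_left == 0:
            return
        for first in range(1, max_part + 1):
            for rest in build(first, parts_left - 1):
                yield (first,) + rest
    return list(build(a, b))

def gf(objects, weight):
    counts = Counter(weight(obj) for obj in objects)
    return [counts[i] for i in range(max(counts) + 1)]

words = binary_words(2, 3)
results = [gf(words, inv), gf(words, area), gf(words, area_prime),
           gf(partitions_in_box(2, 3), sum)]
print(results[0])                              # [1, 1, 2, 2, 2, 1, 1]
print(all(r == results[0] for r in results))   # True
```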


Step 1 of the preceding proof used the recursion for q-binomial coefficients to connect the
algebraic and combinatorial interpretations for these coefficients. Here is another approach
to proving the theorem, based on clearing denominators in
$\begin{bmatrix} a+b \\ a,b \end{bmatrix}_q = \frac{[a+b]!_q}{[a]!_q [b]!_q}$. We show that

\[ [a+b]!_q = [a]!_q \, [b]!_q \sum_{w \in R(0^a 1^b)} q^{\mathrm{inv}(w)}, \tag{8.3} \]

which reproves the first equality in Theorem 8.34.

We know from Theorem 8.21 that the left side of (8.3) is the generating function for
the set $S_{a+b}$ of permutations of {1, 2, ..., a + b}, weighted by inversions. By the Product
Rule for Weighted Sets, the right side is the generating function for the Cartesian product
$S_a \times S_b \times R(0^a 1^b)$, with weight wt(u,v,w) = inv(u) + inv(v) + inv(w) for $u \in S_a$, $v \in S_b$,
and $w \in R(0^a 1^b)$. By the Bijection Rule for Weighted Sets, it suffices to define a bijection

\[ f : S_a \times S_b \times R(0^a 1^b) \to S_{a+b} \]

such that inv(f(u,v,w)) = inv(u) + inv(v) + inv(w) for $u \in S_a$, $v \in S_b$, and $w \in R(0^a 1^b)$.

Given (u,v,w) in the domain of f, note that u is a permutation of the a symbols
1, 2, ..., a. Replace the a zeroes in w with these a symbols, in the same order that they
occur in u. Next, add a to each of the values in the permutation v. Then replace the b ones
from the original word w by these new values in the same order that they occur in v. The
resulting word z is evidently a permutation of {1, 2, ..., a + b}. For example, if a = 3 and
b = 5, then

f(132, 24531, 01100111) = 15732864.

Since a and b are fixed and known, we can invert the action of f. Starting with a
permutation z of {1, 2, ..., a + b}, we first recover the word $w \in R(0^a 1^b)$ by replacing the
numbers 1, 2, ..., a in z by zeroes and replacing the numbers a + 1, ..., a + b in z by ones.
Next, we take the subword of z consisting of the numbers 1, 2, ..., a to recover u. Similarly,
let v' be the subword of z consisting of the numbers a + 1, ..., a + b. We recover v by
subtracting a from each of these numbers. This algorithm defines a two-sided inverse map
to f. For example, still taking a = 3 and b = 5, we have

$f^{-1}(35162847) = (312, 23514, 01010111)$.

To finish, we check that f is weight-preserving. Fix u, v, w, z with z = f(u,v,w). Let A
be the set of positions in z occupied by letters in u, and let B be the remaining positions
(occupied by shifted letters of v). Equivalently, by definition of f, $A = \{i : w_i = 0\}$ and
$B = \{i : w_i = 1\}$. The inversions of z can be classified into three kinds. First, there are
inversions (i,j) such that $i, j \in A$. These inversions correspond bijectively to the inversions
of u. Second, there are inversions (i,j) such that $i, j \in B$. These inversions correspond
bijectively to the inversions of v. Third, there are inversions (i,j) such that $i \in A$ and
$j \in B$, or $i \in B$ and $j \in A$. The first case ($i \in A$, $j \in B$) cannot occur, because every
position in A is filled with a lower number than every position in B. The second case
($i \in B$, $j \in A$) occurs iff i < j and $w_i = 1$ and $w_j = 0$. This means that the inversions of the
third kind in z correspond bijectively to the inversions of the binary word w. Adding the
three kinds of inversions, we conclude that inv(z) = inv(u) + inv(v) + inv(w), as needed.
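The map f and its inverse are simple merge and split operations, so the worked examples and the weight-preservation property can be checked directly. In this sketch, u, v, w, and z are lists of integers; the function names `f` and `f_inverse` mirror the text, while the list representation is an assumption made for illustration.

```python
def inv(w):
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def f(u, v, w):
    """Fill the zeroes of w with the letters of u and the ones of w with the
    letters of v shifted up by a = len(u), preserving their orders."""
    a = len(u)
    small = iter(u)
    large = iter(x + a for x in v)
    return [next(small) if bit == 0 else next(large) for bit in w]

def f_inverse(z, a):
    """Split z back into (u, v, w), given the known value of a."""
    w = [0 if x <= a else 1 for x in z]
    u = [x for x in z if x <= a]
    v = [x - a for x in z if x > a]
    return u, v, w

u, v, w = [1, 3, 2], [2, 4, 5, 3, 1], [0, 1, 1, 0, 0, 1, 1, 1]
z = f(u, v, w)
print(z)                                         # [1, 5, 7, 3, 2, 8, 6, 4]
print(inv(z), inv(u) + inv(v) + inv(w))          # 11 11
print(f_inverse([3, 5, 1, 6, 2, 8, 4, 7], 3))
# ([3, 1, 2], [2, 3, 5, 1, 4], [0, 1, 0, 1, 0, 1, 1, 1])
```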


8.35. Remark. We have discussed several combinatorial interpretations of the q-binomial
coefficients. These coefficients are also relevant to the study of linear algebra over finite
fields. Specifically, let F be a finite field with q elements, where (by a theorem of abstract
algebra) q is necessarily a prime power. Then the q-binomial coefficient $\begin{bmatrix} n \\ k \end{bmatrix}_q$ is an integer
that counts the number of k-dimensional subspaces of the n-dimensional vector space $F^n$.
We prove this fact in §12.7. Incidentally, this is why the letter q appears so frequently as
the variable in q-analogues: in algebra, p is often used to denote a prime integer, and $q = p^e$
is used to denote a prime power.


8.8 q-Binomial Coefficient Identities 


Like ordinary binomial coefficients, the q-binomial coefficients satisfy many algebraic
identities, which often have combinatorial proofs. This section gives two examples of such
identities: a q-analogue of the Chu-Vandermonde identity 2.17, and a q-analogue of the Binomial
Theorem 2.8.


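8.36. Theorem: q-Chu-Vandermonde Identity. For all integers $a,b,c \geq 0$,
$$\begin{bmatrix} a+b+c+1 \\ a,\ b+c+1 \end{bmatrix}_q = \sum_{k=0}^{a} q^{(b+1)(a-k)} \begin{bmatrix} k+b \\ k,\ b \end{bmatrix}_q \begin{bmatrix} a-k+c \\ a-k,\ c \end{bmatrix}_q .$$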


Proof. Recall the diagram we used to prove the original version of the identity, which is
redrawn here as Figure 8.3. The path dissection in this diagram defines a bijection
$$f : L(a, b+c+1) \to \bigcup_{k=0}^{a} L(k,b) \times L(a-k,c).$$
Here, $k$ is the x-coordinate where the given path in $L(a, b+c+1)$ crosses the line $y = b+(1/2)$.
If a path $P \in L(a,b+c+1)$ maps to $(Q, R) \in L(k,b) \times L(a-k,c)$ under $f$, then we see
from the diagram that
$$\mathrm{area}(P) = \mathrm{area}(Q) + \mathrm{area}(R) + (b+1)(a-k).$$
The term $(b+1)(a-k)$ is the area of the lower-right rectangle of width $a-k$ and height
$b+1$. It now follows from the Weight-Shifting Rule, the Sum Rule, and the Product Rule
for Weighted Sets that
$$G_{L(a,b+c+1),\mathrm{area}}(q) = \sum_{k=0}^{a} q^{(b+1)(a-k)}\, G_{L(k,b),\mathrm{area}}(q) \cdot G_{L(a-k,c),\mathrm{area}}(q).$$

[Figure: a lattice path from (0,0) to (a,b+c+1), with the point (k,b+1) marked on the line x = k.]

FIGURE 8.3
Diagram used to prove the q-Chu-Vandermonde identity.

We complete the proof by using Theorem 8.34 to replace each area generating function by
the appropriate q-binomial coefficient. □
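
As a sanity check on the statement of Theorem 8.36, the two sides of the identity can be compared as polynomials for small $a, b, c$. The sketch below is illustrative only: polynomials in $q$ are stored as `Counter` objects mapping exponents to coefficients, and the helpers `qbinom`, `shift`, `mul`, and `chu_vandermonde_sides` are our own names.

```python
from collections import Counter


def qbinom(n, k):
    """Coefficients of [n choose k]_q as a Counter {exponent: coefficient}."""
    if k < 0 or k > n:
        return Counter()
    if n == 0:
        return Counter({0: 1})
    shifted = Counter({e + k: c for e, c in qbinom(n - 1, k).items()})
    return qbinom(n - 1, k - 1) + shifted


def shift(poly, e):
    """Multiply a polynomial by q^e."""
    return Counter({d + e: c for d, c in poly.items()})


def mul(p, r):
    """Multiply two polynomials."""
    out = Counter()
    for d1, c1 in p.items():
        for d2, c2 in r.items():
            out[d1 + d2] += c1 * c2
    return out


def chu_vandermonde_sides(a, b, c):
    """Return (left side, right side) of the q-Chu-Vandermonde identity."""
    lhs = qbinom(a + b + c + 1, a)
    rhs = Counter()
    for k in range(a + 1):
        term = mul(qbinom(k + b, k), qbinom(a - k + c, a - k))
        rhs += shift(term, (b + 1) * (a - k))
    return lhs, rhs
```

For $a = b = 1$, $c = 0$ both sides equal $1 + q + q^2$: the $k = 0$ term contributes $q^2$ and the $k = 1$ term contributes $1 + q$.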


Next we prove a q-analogue of the Binomial Theorem.


8.37. The q-Binomial Theorem. For all variables $x, y, q$ and $n \in \mathbb{Z}_{>0}$,
$$(x + qy)(x + q^2y)(x + q^3y) \cdots (x + q^ny) = \sum_{k=0}^{n} q^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q x^{n-k} y^k. \tag{8.4}$$
Proof. We first prove the special case of the theorem where $x = 1$. Fix $n > 0$, and let $P$ be
the set of integer partitions consisting of distinct parts chosen from $\{1,2,\ldots,n\}$. On one
hand, the two-variable version of the Product Rule for Weighted Sets shows that
$$\sum_{\mu \in P} q^{|\mu|} y^{\ell(\mu)} = \prod_{i=1}^{n} (1 + q^i y) \tag{8.5}$$
(this is a finite version of Theorem 5.48). Specifically, we build $\mu \in P$ by choosing to exclude
or include a part of size $i$, for each $i$ between 1 and $n$. The generating function for the $i$th
choice is $q^0y^0 + q^iy^1 = 1 + q^iy$, since the area of $\mu$ increases by $i$ and the length of $\mu$ increases
by 1 if the part $i$ is included, whereas area and length increase by 0 if part $i$ is excluded.
On the other hand, we can write $P$ as the disjoint union of sets $P_0, P_1, \ldots, P_n$, where
$P_k = \{\mu \in P : \ell(\mu) = k\}$. Recall from Theorem 8.34 that $P(a,b) = \{\nu \in \mathrm{Par} : \nu_1 \leq a$ and $\ell(\nu) \leq b\}$.
Fix $k$, and define a map $F : P_k \to P(n-k,k)$ by
$F(\mu_1, \mu_2, \ldots, \mu_k) = (\mu_1 - k, \mu_2 - (k-1), \ldots, \mu_k - 1)$.
One readily checks that $F$ does map into the stated codomain, and $F$ is bijective with two-sided inverse
$F^{-1}(\nu_1, \ldots, \nu_k) = (\nu_1 + k, \nu_2 + (k-1), \ldots, \nu_k + 1)$.
Here we use the convention $\nu_i = 0$ for $\ell(\nu) < i \leq k$. Looking only at area, applying the
bijection $F$ decreases the area of $\mu$ by $k + (k-1) + \cdots + 1 = k(k+1)/2$. On the other
hand, every object in $P_k$ has length $k$. By the Weight-Shifting Rule and Theorem 8.34, we
conclude that
$$\sum_{\mu \in P_k} q^{|\mu|} y^{\ell(\mu)} = q^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q y^k.$$
Adding these equations for $k$ ranging from 0 to $n$ and comparing to (8.5), we obtain the
special case of the theorem where $x = 1$.
To prove the general case, replace $y$ by $y/x$ in what we just proved to get
$$\prod_{i=1}^{n} \left[1 + q^i (y/x)\right] = \sum_{k=0}^{n} q^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q (y/x)^k.$$
Multiplying both sides by $x^n$ and simplifying, we obtain the needed result. □
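
The special case $x = 1$ lends itself to a direct machine check. The sketch below (an illustration, with helper names of our choosing) expands $\prod_{i=1}^{n}(1 + q^i y)$ by the include/exclude choice described in the proof, storing each monomial $q^e y^k$ under the key `(k, e)`, and compares the result with $\sum_k q^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q y^k$.

```python
from collections import Counter


def qbinom(n, k):
    """Coefficients of [n choose k]_q as a Counter {exponent: coefficient}."""
    if k < 0 or k > n:
        return Counter()
    if n == 0:
        return Counter({0: 1})
    shifted = Counter({e + k: c for e, c in qbinom(n - 1, k).items()})
    return qbinom(n - 1, k - 1) + shifted


def product_side(n):
    """Expand prod_{i=1..n} (1 + q^i y); keys are (power of y, power of q)."""
    poly = Counter({(0, 0): 1})
    for i in range(1, n + 1):
        nxt = Counter(poly)                  # exclude the part i
        for (k, e), c in poly.items():       # include the part i
            nxt[(k + 1, e + i)] += c
        poly = nxt
    return poly


def sum_side(n):
    """Build sum_k q^{k(k+1)/2} [n choose k]_q y^k with the same key convention."""
    out = Counter()
    for k in range(n + 1):
        for e, c in qbinom(n, k).items():
            out[(k, e + k * (k + 1) // 2)] += c
    return out
```

For $n = 2$ both functions return the four monomials $1$, $qy$, $q^2y$, and $q^3y^2$.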


8.9 q-Multinomial Coefficients 


Let $n, n_1, \ldots, n_s \in \mathbb{Z}_{\geq 0}$ satisfy $n = n_1 + \cdots + n_s$. Recall from Recursion 2.23 that the
ordinary multinomial coefficients $C(n; n_1, \ldots, n_s) = \binom{n}{n_1, \ldots, n_s} = \frac{n!}{n_1! \cdots n_s!}$ satisfy
$$C(n; n_1, \ldots, n_s) = \sum_{k=1}^{s} C(n-1; n_1, \ldots, n_k - 1, \ldots, n_s),$$
with initial conditions $C(0; 0, \ldots, 0) = 1$ and $C(m; m_1, \ldots, m_s) = 0$ whenever some $m_i$ is
negative. The q-multinomial coefficients satisfy the following analogous recursion.


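8.38. Theorem: Recursion for q-Multinomial Coefficients. Let $n_1, \ldots, n_s$ be nonnegative
integers, and set $n = \sum_{k=1}^{s} n_k$. Then
$$\begin{bmatrix} n \\ n_1, \ldots, n_s \end{bmatrix}_q = \sum_{k=1}^{s} q^{n_1 + \cdots + n_{k-1}} \begin{bmatrix} n-1 \\ n_1, \ldots, n_k - 1, \ldots, n_s \end{bmatrix}_q,$$
where we interpret the $k$th summand on the right side to be zero if $n_k = 0$. The initial
condition is $\begin{bmatrix} 0 \\ 0, \ldots, 0 \end{bmatrix}_q = 1$. Moreover, $\begin{bmatrix} n \\ n_1, \ldots, n_s \end{bmatrix}_q$ is a polynomial in $q$ with coefficients in $\mathbb{Z}_{\geq 0}$.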
Proof. Neither side of the claimed recursion changes if we delete all $n_i$'s that are equal
to zero; so, without loss of generality, assume every $n_i$ is positive. We can create a common
factor of $[n-1]!_q / \prod_{j=1}^{s} [n_j]!_q$ on the right side by multiplying the $k$th summand by
$[n_k]_q/[n_k]_q$, for $1 \leq k \leq s$. Pulling out this common factor, we are left with
$$\sum_{k=1}^{s} q^{n_1 + \cdots + n_{k-1}} [n_k]_q = \sum_{k=1}^{s} q^{n_1 + \cdots + n_{k-1}} (1 + q + q^2 + \cdots + q^{n_k - 1}).$$
The $k$th summand consists of the sum of consecutive powers of $q$ starting at $q^{n_1 + \cdots + n_{k-1}}$
and ending at $q^{n_1 + \cdots + n_k - 1}$. Chaining these together, we see that the sum evaluates to
$q^0 + q^1 + \cdots + q^{n-1} = [n]_q$. Multiplying by the common factor mentioned above, we obtain
$$\sum_{k=1}^{s} q^{n_1 + \cdots + n_{k-1}} \begin{bmatrix} n-1 \\ n_1, \ldots, n_k - 1, \ldots, n_s \end{bmatrix}_q = \frac{[n]_q\, [n-1]!_q}{\prod_{j=1}^{s} [n_j]!_q} = \begin{bmatrix} n \\ n_1, \ldots, n_s \end{bmatrix}_q,$$
as needed. The initial condition follows from $[0]!_q = 1$. Finally, we deduce polynomiality of
the q-multinomial coefficients using induction on $n$ and the recursion just proved, as in the
proof of Corollary 8.33. □
8.39. Theorem: q-Multinomial Coefficients and Anagrams. For any totally ordered
alphabet $A = \{a_1 < a_2 < \cdots < a_s\}$ and all integers $n_1, \ldots, n_s \geq 0$,
$$\begin{bmatrix} n_1 + \cdots + n_s \\ n_1, \ldots, n_s \end{bmatrix}_q = \sum_{w \in R(a_1^{n_1} \cdots a_s^{n_s})} q^{\mathrm{inv}(w)}.$$
Proof. For all integers $n_1, \ldots, n_s$, define
$$g(n_1, \ldots, n_s) = \sum_{w \in R(a_1^{n_1} \cdots a_s^{n_s})} q^{\mathrm{inv}(w)}.$$
(This is zero by convention if any $n_i$ is negative.) By induction on $\sum_k n_k$, it suffices to show
that $g$ satisfies the recursion in Theorem 8.38. Now $g(0,0,\ldots,0) = q^0 = 1$, so the initial
condition is correct. Next, fix $n_1, \ldots, n_s \geq 0$, and let $W$ be the set of words appearing in
the definition of $g(n_1, \ldots, n_s)$. Write $W$ as the disjoint union of sets $W_1, \ldots, W_s$, where $W_k$
consists of the words in $W$ with first letter $a_k$. By the Sum Rule for Weighted Sets,
$$g(n_1, \ldots, n_s) = G_W(q) = \sum_{k=1}^{s} G_{W_k}(q).$$
Fix a value of $k$ in the range $1 \leq k \leq s$ such that $W_k$ is nonempty. Erasing the first letter of a
word $w$ in $W_k$ defines a bijection from $W_k$ to the set $R(a_1^{n_1} \cdots a_k^{n_k - 1} \cdots a_s^{n_s})$. The generating
function for the latter set is $g(n_1, \ldots, n_k - 1, \ldots, n_s)$. The bijection in question does not
preserve weights, because inversions involving the first letter of $w \in W_k$ disappear when this
letter is erased. However, no matter what word $w$ we pick in $W_k$, the number of inversions
that involve the first letter in $w$ will always be the same. Specifically, this first letter (namely
$a_k$) will cause inversions with all of the occurrences of $a_1, a_2, \ldots, a_{k-1}$ that follow it in $w$.
The number of such letters is $n_1 + \cdots + n_{k-1}$. Therefore, by the Weight-Shifting Rule,
$$G_{W_k,\mathrm{inv}}(q) = q^{n_1 + \cdots + n_{k-1}}\, g(n_1, \ldots, n_k - 1, \ldots, n_s).$$
This equation is also correct if $W_k = \emptyset$ (which occurs iff $n_k = 0$). Using these results in the
formula above, we conclude that
$$g(n_1, \ldots, n_s) = \sum_{k=1}^{s} q^{n_1 + \cdots + n_{k-1}}\, g(n_1, \ldots, n_k - 1, \ldots, n_s),$$
which is precisely the recursion occurring in Theorem 8.38. □
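
Theorem 8.39 can be tested on small anagram classes. The sketch below is an illustration with names of our choosing: it computes the q-multinomial coefficient from the recursion of Theorem 8.38 and, separately, tabulates inversions over all anagrams in $R(1^{n_1} \cdots s^{n_s})$ by brute force (using the alphabet $1 < 2 < \cdots < s$).

```python
from collections import Counter
from itertools import permutations


def inv(w):
    """Number of pairs i < j with w[i] > w[j]."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])


def qmultinomial(ns):
    """q-multinomial coefficient as {exponent: coefficient}, via Theorem 8.38."""
    ns = tuple(ns)
    if sum(ns) == 0:
        return Counter({0: 1})
    total = Counter()
    for k, nk in enumerate(ns):
        if nk == 0:
            continue                          # this summand is zero by convention
        smaller = qmultinomial(ns[:k] + (nk - 1,) + ns[k + 1:])
        offset = sum(ns[:k])                  # n_1 + ... + n_{k-1}
        total += Counter({e + offset: c for e, c in smaller.items()})
    return total


def inv_generating_function(ns):
    """Tabulate inv over all anagrams with ns[k] copies of the letter k+1."""
    letters = [k + 1 for k, nk in enumerate(ns) for _ in range(nk)]
    return Counter(inv(w) for w in set(permutations(letters)))
```

For `ns = (2, 2)` both computations give $1 + q + 2q^2 + q^3 + q^4$.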


8.40. Remark. Theorem 8.39 can also be proved by generalizing (8.3) in the second proof
of Theorem 8.34. Specifically, one can prove
$$[n_1 + \cdots + n_s]!_q = [n_1]!_q \cdots [n_s]!_q \sum_{w \in R(1^{n_1} \cdots s^{n_s})} q^{\mathrm{inv}(w)}$$
by defining a weight-preserving bijection
$$f : S_{n_1 + \cdots + n_s} \to S_{n_1} \times \cdots \times S_{n_s} \times R(1^{n_1} \cdots s^{n_s}),$$
where $S_{n_i}$ is the set of all permutations of $\{1,2,\ldots,n_i\}$, and all sets in the Cartesian product
are weighted by inversions. We leave the details as an exercise.




8.10 Foata’s Bijection 


We know from Theorems 8.21 and 8.26 that $\sum_{w \in S_n} q^{\mathrm{inv}(w)} = [n]!_q = \sum_{w \in S_n} q^{\mathrm{maj}(w)}$, where
$S_n$ is the set of permutations of $\{1,2,\ldots,n\}$. We can express this result by saying that the
statistics inv and maj are equidistributed on $S_n$. We have just derived a formula for the
distribution of inv on more general sets of words, namely
$$\sum_{w \in R(1^{n_1} \cdots s^{n_s})} q^{\mathrm{inv}(w)} = \begin{bmatrix} n_1 + \cdots + n_s \\ n_1, \ldots, n_s \end{bmatrix}_q.$$
Could it be true that inv and maj are still equidistributed on these more general sets?
MacMahon proved that this is indeed the case. We present a combinatorial proof of this
result based on a bijection due to Dominique Foata. For each set $S = R(1^{n_1} \cdots s^{n_s})$, our
goal is to define a weight-preserving bijection $f : (S, \mathrm{maj}) \to (S, \mathrm{inv})$.

To achieve our goal, let $W$ be the set of all words in the alphabet $\{1,2,\ldots,s\}$. We shall
define a function $g : W \to W$ with the following properties: (a) $g$ is a bijection; (b) for all
$w \in W$, $w$ and $g(w)$ are anagrams (see §1.5); (c) if $w$ is not the empty word, then $w$ and
$g(w)$ have the same last letter; (d) for all $w \in W$, $\mathrm{inv}(g(w)) = \mathrm{maj}(w)$. We can then obtain
the required weight-preserving bijections $f$ by restricting $g$ to the various anagram classes
$R(1^{n_1} \cdots s^{n_s})$.

We define $g$ by recursion on the length of $w \in W$. If this length is 0 or 1, set $g(w) = w$.
Then conditions (b), (c), and (d) hold in this case. Now suppose $w$ has length $n \geq 2$. Write
$w = w'yz$, where $w' \in W$ and $y, z$ are the last two letters of $w$. We can assume by induction
that $u = g(w'y)$ has already been defined, and that $u$ is an anagram of $w'y$ ending in $y$ such
that $\mathrm{inv}(u) = \mathrm{maj}(w'y)$. We define $g(w) = h_z(u)z$, where $h_z : W \to W$ is a certain map (to
be described momentarily) that satisfies conditions (a) and (b) above. No matter what the
details of the definition of $h_z$, it already follows that $g$ satisfies conditions (b) and (c) for
words of length $n$.

To motivate the definition of $h_z$, we first give a lemma that analyzes the effect on inv
and maj of appending a letter to the end of a word. The lemma uses the following notation.
If $u$ is any word and $z$ is any letter, let $n_{\leq z}(u)$ be the number of letters in $u$ (counting
repetitions) that are $\leq z$; define $n_{<z}(u)$, $n_{\geq z}(u)$, and $n_{>z}(u)$ similarly.


8.41. Lemma. Suppose $u$ is a word of length $m$ with last letter $y$, and $z$ is any letter. (a) If
$y \leq z$, then $\mathrm{maj}(uz) = \mathrm{maj}(u)$. (b) If $y > z$, then $\mathrm{maj}(uz) = \mathrm{maj}(u) + m$. (c) $\mathrm{inv}(uz) =
\mathrm{inv}(u) + n_{>z}(u)$.


Proof. All statements follow routinely from the definitions of inv and maj. □


We now describe the map $h_z : W \to W$. First, $h_z$ sends the empty word to itself. Suppose
$u$ is a nonempty word ending in $y$. There are two cases.

Case 1: $y \leq z$. In this case, we break the word $u$ into runs of consecutive letters such
that the last letter in each run is $\leq z$, while all preceding letters in the run are $> z$. For
example, if $u = 1342434453552$ and $z = 3$, then the decomposition of $u$ into runs is

u = 1/3/4,2/4,3/4,4,5,3/5,5,2/

where we use slashes to delimit consecutive runs. Now, $h_z$ operates on $u$ by cyclically shifting
the letters in each run one step to the right. Continuing the preceding example,

h_3(u) = 1/3/2,4/3,4/3,4,4,5/2,5,5/.

What effect does this process have on $\mathrm{inv}(u)$? In $u$, the last element in each run (which is
$\leq z$) is strictly less than all elements before it in its run (which are $> z$). So, moving the
last element to the front of its run causes the inversion number to drop by the number of
elements $> z$ in the run. Adding up these changes over all the runs, we see that
$$\mathrm{inv}(h_z(u)) = \mathrm{inv}(u) - n_{>z}(u) \quad \text{in Case 1.} \tag{8.6}$$
Furthermore, note that the first letter of $h_z(u)$ is always $\leq z$ in this case.

Case 2: $y > z$. Again we break the word $u$ into runs, but here the last letter of each
run must be $> z$, while all preceding letters in the run are $\leq z$. For example, if $z = 3$ and
$u = 134243445355$, we decompose $u$ as

u = 1,3,4/2,4/3,4/4/5/3,5/5/.

As before, we cyclically shift the letters in each run one step right, which gives

h_3(u) = 4,1,3/4,2/4,3/4/5/5,3/5/

in our example. This time, the last element in each run of $u$ is $> z$ and is strictly greater
than the elements $\leq z$ that precede it in its run. So, the cyclic shift of each run will increase
the inversion count by the number of elements $\leq z$ in the run. Adding over all runs, we see
that
$$\mathrm{inv}(h_z(u)) = \mathrm{inv}(u) + n_{\leq z}(u) \quad \text{in Case 2.} \tag{8.7}$$
Furthermore, note that the first letter of $h_z(u)$ is always $> z$ in this case.

In both cases, $h_z(u)$ is an anagram of $u$. Moreover, we can invert the action of $h_z$ as
follows. Examination of the first letter of $h_z(u)$ tells us whether we were in Case 1 or Case 2
above. To invert in Case 1, break the word into runs whose first letter is $\leq z$ and whose
other letters are $> z$, and cyclically shift each run one step left. To invert in Case 2, break
the word into runs whose first letter is $> z$ and whose other letters are $\leq z$, and cyclically
shift each run one step left. We now see that $h_z$ is a bijection. For example, to compute
$h_3^{-1}(1342434453552)$, first write

1/3,4/2,4/3,4,4,5/3,5,5/2/

and then cyclically shift to get the answer 1/4,3/4,2/4,4,5,3/5,5,3/2/.

Now we return to the discussion of $g$. Recall that we have set $g(w) = g(w'yz) = h_z(u)z$,
where $u = g(w'y)$ is an anagram of $w'y$ ending in $y$ and satisfying $\mathrm{inv}(u) = \mathrm{maj}(w'y)$. To
check condition (d) for this $w$, we must show that $\mathrm{inv}(h_z(u)z) = \mathrm{maj}(w)$. Again consider
two cases. If $y \leq z$, then on one hand,
$$\mathrm{maj}(w) = \mathrm{maj}(w'yz) = \mathrm{maj}(w'y) = \mathrm{inv}(u).$$
On the other hand, by Lemma 8.41 and (8.6), we have
$$\mathrm{inv}(h_z(u)z) = \mathrm{inv}(h_z(u)) + n_{>z}(h_z(u)) = \mathrm{inv}(u) - n_{>z}(u) + n_{>z}(u) = \mathrm{inv}(u),$$
where $n_{>z}(h_z(u)) = n_{>z}(u)$ since $h_z(u)$ and $u$ are anagrams. If $y > z$, then on one hand,
$$\mathrm{maj}(w) = \mathrm{maj}(w'yz) = \mathrm{maj}(w'y) + n - 1 = \mathrm{inv}(u) + n - 1.$$
On the other hand, Lemma 8.41 and (8.7) give
$$\mathrm{inv}(h_z(u)z) = \mathrm{inv}(h_z(u)) + n_{>z}(h_z(u)) = \mathrm{inv}(u) + n_{\leq z}(u) + n_{>z}(u) = \mathrm{inv}(u) + n - 1,$$
since $u$ has $n - 1$ letters, each of which is either $\leq z$ or $> z$.

To prove that $g$ is a bijection, we describe the two-sided inverse map $g^{-1}$. This is the
identity map on words of length at most 1. To compute $g^{-1}(uz)$, first compute $u' = h_z^{-1}(u)$.
Then return the answer $g^{-1}(uz) = (g^{-1}(u'))z$. Here is a non-recursive description of the
maps $g$ and $g^{-1}$, obtained by unrolling the recursive applications of $g$ and $g^{-1}$ in the
preceding definitions.

To compute $g(w_1 w_2 \cdots w_n)$: for $i = 2, \ldots, n$ in this order, apply $h_{w_i}$ to the first $i - 1$
letters of the current word.

To compute $g^{-1}(z_1 z_2 \cdots z_n)$: for $i = n, n-1, \ldots, 2$ in this order, let $z'$ be the $i$th letter
of the current word, and apply $h_{z'}^{-1}$ to the first $i - 1$ letters of the current word.
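
The non-recursive descriptions above translate almost line by line into code. The following Python sketch is an illustration, not a reference implementation: words are lists of integers, and the function names `h`, `h_inv`, `foata`, and `foata_inv` are our own labels for $h_z$, $h_z^{-1}$, $g$, and $g^{-1}$.

```python
def inv(w):
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])


def maj(w):
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def h(z, u):
    """Apply h_z: split u into runs and rotate each run one step to the right."""
    if not u:
        return []
    case1 = u[-1] <= z                 # Case 1 iff the last letter y satisfies y <= z
    out, run = [], []
    for x in u:
        run.append(x)
        if (x <= z) == case1:          # x closes a run (<= z in Case 1, > z in Case 2)
            out.extend([run[-1]] + run[:-1])
            run = []
    return out


def h_inv(z, u):
    """Invert h_z: runs are detected by their first letter and rotated left."""
    if not u:
        return []
    case1 = u[0] <= z                  # Case 1 iff the first letter is <= z
    out, run = [], []
    for x in u:
        if (x <= z) == case1 and run:  # x opens a new run, so flush the previous one
            out.extend(run[1:] + run[:1])
            run = []
        run.append(x)
    out.extend(run[1:] + run[:1])
    return out


def foata(w):
    """The map g: inv(foata(w)) == maj(w)."""
    cur = list(w)
    for i in range(1, len(cur)):       # cur[i] is the letter w_{i+1} of the text
        cur[:i] = h(cur[i], cur[:i])
    return cur


def foata_inv(w):
    """The inverse map: maj(foata_inv(w)) == inv(w)."""
    cur = list(w)
    for i in range(len(cur) - 1, 0, -1):
        cur[:i] = h_inv(cur[i], cur[:i])
    return cur
```

With these definitions, `foata([2, 1, 3, 3, 1, 3, 2, 2])` returns `[2, 3, 1, 3, 1, 3, 2, 2]`, in agreement with Example 8.42.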


8.42. Example. Figure 8.4 illustrates the computation of $g(w)$ for $w = 21331322$. We find
that $g(w) = 23131322$. Observe that $\mathrm{maj}(w) = 1 + 4 + 6 = 11 = \mathrm{inv}(g(w))$. Next, Figure 8.5
illustrates the calculation of $g^{-1}(w)$. We have $g^{-1}(w) = 33213122$, and $\mathrm{inv}(w) = 10 =
\mathrm{maj}(g^{-1}(w))$.


                                          current word:
                                          2,1,3,3,1,3,2,2
h_1(2) = 2;                               2,1,3,3,1,3,2,2
h_3(2,1) = 2,1;                           2,1,3,3,1,3,2,2
h_3(2,1,3) = 2,1,3;                       2,1,3,3,1,3,2,2
h_1(2,1,3,3) = 2,3,1,3;                   2,3,1,3,1,3,2,2
h_3(2,3,1,3,1) = 2,3,1,3,1;               2,3,1,3,1,3,2,2
h_2(2,3,1,3,1,3) = 3,2,3,1,3,1;           3,2,3,1,3,1,2,2
h_2(3,2,3,1,3,1,2) = 2,3,1,3,1,3,2;       2,3,1,3,1,3,2,2

FIGURE 8.4
Computation of g(w).

                                          current word:
                                          2,1,3,3,1,3,2,2
h_2^{-1}(2,1,3,3,1,3,2) = 2,3,3,1,3,1,2;  2,3,3,1,3,1,2,2
h_2^{-1}(2,3,3,1,3,1) = 3,3,2,3,1,1;      3,3,2,3,1,1,2,2
h_1^{-1}(3,3,2,3,1) = 3,3,2,1,3;          3,3,2,1,3,1,2,2
h_3^{-1}(3,3,2,1) = 3,3,2,1;              3,3,2,1,3,1,2,2
h_1^{-1}(3,3,2) = 3,3,2;                  3,3,2,1,3,1,2,2
h_2^{-1}(3,3) = 3,3;                      3,3,2,1,3,1,2,2
h_3^{-1}(3) = 3;                          3,3,2,1,3,1,2,2

FIGURE 8.5
Computation of g^{-1}(w).
We summarize the results of this section in the following theorem.

8.43. Theorem. For all $n_1, \ldots, n_s \geq 0$,
$$\sum_{w \in R(1^{n_1} \cdots s^{n_s})} q^{\mathrm{maj}(w)} = \sum_{w \in R(1^{n_1} \cdots s^{n_s})} q^{\mathrm{inv}(w)} = \begin{bmatrix} n_1 + \cdots + n_s \\ n_1, \ldots, n_s \end{bmatrix}_q.$$
More precisely, there is a bijection on $R(1^{n_1} \cdots s^{n_s})$ sending maj to inv and preserving the
last letter of each word.


8.11 q-Catalan Numbers 


In this section, we investigate two weighted analogues of the Catalan numbers. Recall that
the Catalan number $C_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{n!\,(n+1)!}$ counts the collection of all lattice
paths from (0,0) to $(n,n)$ that never go below the line $y = x$ (see §1.14). Let $D_n$ be the
collection of these paths, which are called Dyck paths. Also, let $W_n$ be the set of words that
encode the Dyck paths, where we use 0 to encode a north step and 1 to encode an east step.
Elements of $W_n$ are called Dyck words.


8.44. Definition: Statistics on Dyck Paths. For every Dyck path $P \in D_n$, let $\mathrm{area}(P)$
be the number of complete unit squares located between $P$ and the line $y = x$. If $P$ is
encoded by the Dyck word $w \in W_n$, let $\mathrm{inv}(P) = \mathrm{inv}(w)$ and $\mathrm{maj}(P) = \mathrm{maj}(w)$.


For example, the path $P$ shown in Figure 8.6 has $\mathrm{area}(P) = 23$. One sees that $\mathrm{inv}(P)$
is the number of unit squares in the region bounded by $P$, the y-axis, and the line $y = n$.
We also have $\mathrm{inv}(P) + \mathrm{area}(P) = \binom{n}{2}$ since $\binom{n}{2}$ is the total number of area squares in the
bounding triangle. The statistic $\mathrm{maj}(P)$ is the sum of the number of steps in the path that
precede each left-turn where an east step (1) is immediately followed by a north step (0). For
the path in Figure 8.6, we have $\mathrm{inv}(P) = 97$ and $\mathrm{maj}(P) = 4+6+10+16+18+22+24+28 =
128$.


[Figure: a Dyck path P drawn from (0,0) to (n,n).]

FIGURE 8.6
First-return analysis for weighted Dyck paths.


8.45. Example. When $n = 3$, examination of Figure 1.3 shows that
$$\begin{aligned}
G_{D_3,\mathrm{area}}(q) &= 1 + 2q + q^2 + q^3; \\
G_{D_3,\mathrm{inv}}(q) &= 1 + q + 2q^2 + q^3; \\
G_{D_3,\mathrm{maj}}(q) &= 1 + q^2 + q^3 + q^4 + q^6.
\end{aligned}$$
When $n = 4$, a longer calculation gives
$$\begin{aligned}
G_{D_4,\mathrm{area}}(q) &= 1 + 3q + 3q^2 + 3q^3 + 2q^4 + q^5 + q^6; \\
G_{D_4,\mathrm{maj}}(q) &= 1 + q^2 + q^3 + 2q^4 + q^5 + 2q^6 + q^7 + 2q^8 + q^9 + q^{10} + q^{12}.
\end{aligned}$$
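
The polynomials in this example are small enough to recompute by enumeration. The sketch below is illustrative and uses our own helper names; it lists Dyck words (0 for a north step, 1 for an east step), evaluates the three statistics of Definition 8.44, and tabulates each one as a `Counter` mapping exponents to coefficients.

```python
from collections import Counter
from itertools import combinations


def dyck_words(n):
    """Words with n zeros and n ones in which every prefix has at least as many 0s as 1s."""
    words = []
    for ones in combinations(range(2 * n), n):
        w = [0] * (2 * n)
        for i in ones:
            w[i] = 1
        height, ok = 0, True
        for step in w:
            height += 1 if step == 0 else -1   # height = y - x after the step
            if height < 0:
                ok = False
                break
        if ok:
            words.append(tuple(w))
    return words


def area(w):
    """Complete unit squares between the path and the line y = x."""
    x = y = total = 0
    for step in w:
        if step == 0:
            total += y - x    # squares in this row between the path and the diagonal
            y += 1
        else:
            x += 1
    return total


def inv(w):
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])


def maj(w):
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def generating_function(n, stat):
    """Tabulate the statistic stat over all Dyck words of order n."""
    return Counter(stat(w) for w in dyck_words(n))
```

For example, `generating_function(3, area)` has coefficients 1, 2, 1, 1 in degrees 0 through 3, matching $1 + 2q + q^2 + q^3$.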


There is no particularly simple closed formula for $G_{D_n,\mathrm{area}}(q)$, although determinant
formulas do exist for this polynomial. However, these generating functions satisfy a recursion,
which is a q-analogue of the first-return recursion used in the unweighted case (see §2.10).


8.46. Theorem: Recursion for Dyck Paths Weighted by Area. For all $n \geq 0$, set
$C_n(q) = G_{D_n,\mathrm{area}}(q) = \sum_{P \in D_n} q^{\mathrm{area}(P)}$. Then $C_0(q) = 1$, and for all $n \geq 1$,
$$C_n(q) = \sum_{k=1}^{n} q^{k-1} C_{k-1}(q)\, C_{n-k}(q).$$


Proof. We imitate the proof of Theorem 2.28, but now we must take weights into account.
For $1 \leq k \leq n$, write $D_{n,k}$ for the set of Dyck paths ending at $(n,n)$ whose first return to
the line $y = x$ occurs at $(k,k)$. Since $D_n$ is the disjoint union of the $D_{n,k}$ as $k$ ranges from
1 to $n$, the Sum Rule for Weighted Sets gives
$$C_n(q) = \sum_{k=1}^{n} G_{D_{n,k},\mathrm{area}}(q). \tag{8.8}$$
For fixed $k$, we have a bijection from $D_{n,k}$ to $D_{k-1} \times D_{n-k}$ defined by sending $P =
N, P_1, E, P_2$ to $(P_1, P_2)$, where the displayed $E$ is the east step that arrives at $(k,k)$. See
Figure 8.6. Examination of the figure shows that
$$\mathrm{area}(P) = \mathrm{area}(P_1) + \mathrm{area}(P_2) + (k - 1),$$
where the $k - 1$ counts the shaded cells in the figure that are not included in the calculation
of $\mathrm{area}(P_1)$. By the Product Rule and Weight-Shifting Rule, we see that
$$G_{D_{n,k},\mathrm{area}}(q) = C_{k-1}(q)\, C_{n-k}(q)\, q^{k-1}.$$
Inserting this expression into (8.8) proves the recursion. □
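
The first-return recursion gives a quick way to tabulate the polynomials $C_n(q)$. The sketch below is one possible implementation under our own conventions: each polynomial is stored as a list of coefficients indexed by exponent, and `area_catalan` is a name we introduce for illustration.

```python
def area_catalan(n_max):
    """Return [C_0(q), ..., C_{n_max}(q)] as coefficient lists (index = exponent of q)."""
    C = [[1]]                                  # C_0(q) = 1
    for n in range(1, n_max + 1):
        # A Dyck path of order n has area at most n(n-1)/2, so this many slots suffice.
        poly = [0] * (n * (n - 1) // 2 + 1)
        for k in range(1, n + 1):
            for i, a in enumerate(C[k - 1]):
                for j, b in enumerate(C[n - k]):
                    poly[i + j + k - 1] += a * b   # the factor q^{k-1} shifts exponents
        C.append(poly)
    return C
```

Calling `area_catalan(4)` yields `[1, 2, 1, 1]` for $C_3(q)$ and `[1, 3, 3, 3, 2, 1, 1]` for $C_4(q)$, in agreement with Example 8.45; setting $q = 1$ (summing each list) recovers the Catalan numbers.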


Now let us consider the generating function $G_{D_n,\mathrm{maj}}(q)$. This polynomial does have a
nice closed formula, as we see in the next theorem.


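8.47. Theorem: Dyck Paths Weighted by Major Index. For all $n \geq 0$,
$$G_{D_n,\mathrm{maj}}(q) = \sum_{P \in D_n} q^{\mathrm{maj}(P)} = \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q - q \begin{bmatrix} 2n \\ n-1,\ n+1 \end{bmatrix}_q = \frac{1}{[n+1]_q} \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q.$$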


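Proof. The last equality follows from the manipulation
$$\begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q - q \begin{bmatrix} 2n \\ n-1,\ n+1 \end{bmatrix}_q = \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q \left(1 - \frac{q[n]_q}{[n+1]_q}\right) = \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q \cdot \frac{[n+1]_q - q[n]_q}{[n+1]_q} = \frac{1}{[n+1]_q} \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q,$$
where the final step uses $[n+1]_q - q[n]_q = 1$.
The second equality in the theorem statement can be rewritten
$$q \begin{bmatrix} 2n \\ n-1,\ n+1 \end{bmatrix}_q + \sum_{P \in D_n} q^{\mathrm{maj}(P)} = \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q.$$
We now give a bijective proof of this result reminiscent of André’s Reflection Principle (see
the proof of the Dyck Path Rule 1.101). Consider the set of words $S = R(0^n1^n)$, weighted
by major index. By Theorem 8.43, the generating function for this set is $\begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q$. On the other
hand, we can write $S$ as the disjoint union of $W_n$ and $T$, where $W_n$ is the set of Dyck words
and $T$ consists of all other words in $R(0^n1^n)$. We define a bijection $g : T \to R(0^{n+1}1^{n-1})$
such that $\mathrm{maj}(w) = 1 + \mathrm{maj}(g(w))$ for all $w \in T$. By the Bijection Rule and the Weight-Shifting
Rule, it follows that
$$\begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q = G_{S,\mathrm{maj}}(q) = G_{T,\mathrm{maj}}(q) + G_{W_n,\mathrm{maj}}(q) = q \begin{bmatrix} 2n \\ n-1,\ n+1 \end{bmatrix}_q + G_{D_n,\mathrm{maj}}(q),$$
as needed.

FIGURE 8.7
The tipping bijection.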


To define $g(w)$ for $w \in T$, regard $w$ as a lattice path in an $n \times n$ rectangle by interpreting
0’s as north steps and 1’s as east steps. Find the largest $k > 0$ such that the path $w$ touches
the line $y = x - k$. Such a $k$ must exist, because $w \in T$ is not a Dyck path. Consider the
first vertex $v$ on $w$ that touches the line in question. (See Figure 8.7 for an example.) The
path $w$ must arrive at $v$ by taking an east step, and $w$ must leave $v$ by taking a north step.
These steps correspond to certain adjacent letters $w_i = 1$ and $w_{i+1} = 0$ in the word $w$.
Furthermore, since $v$ is the first arrival at the line $y = x - k$, we must have either $i = 1$ or
$w_{i-1} = 1$ (i.e., the step before $w_i$ must be an east step if it exists). Let $g(w)$ be the word
obtained by changing $w_i$ from 1 to 0. Pictorially, we tip the east step arriving at $v$ upwards,
changing it to a north step (which causes the following steps to shift to the northwest). The
word $w = \cdots 1,1,0 \cdots$ turns into $g(w) = \cdots 1,0,0 \cdots$, so the major index drops by exactly
1 when we pass from $w$ to $g(w)$. This result also holds if $i = 1$. The new word $g(w)$ has
$n - 1$ east steps and $n + 1$ north steps, so $g(w) \in R(0^{n+1}1^{n-1})$.

Finally, we show $g$ is invertible. Given a word (or path) $P \in R(0^{n+1}1^{n-1})$, take the
largest $k \geq 0$ such that $P$ touches the line $y = x - k$, and let $v$ be the last time $P$ touches
this line. Here, $v$ is preceded by an east step and followed by two north steps (or $v$ is the
origin and is followed by two north steps). Changing the first north step following $v$ into
an east step produces a path $g'(P) \in R(0^n1^n)$. It can be checked that $g'(P)$ cannot be a
Dyck path (so $g'$ maps into $T$), and that $g'$ is the two-sided inverse of $g$. The key point to
verify is that the selection rules for $v$ ensure that the same step is tipped when we apply $g$
followed by $g'$, and similarly in the other order. □
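
The tipping map is easy to experiment with on small cases. In the sketch below (an illustration; the names `words`, `is_dyck`, and `tip` are ours), a word is a tuple of 0s and 1s, and the quantity `d` tracks $x - y$ along the path, so the first index at which `d` attains its maximum is the east step arriving at the vertex $v$.

```python
from itertools import combinations


def maj(w):
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def words(zeros, ones):
    """All words with the given numbers of 0s (north steps) and 1s (east steps)."""
    n = zeros + ones
    out = []
    for positions in combinations(range(n), ones):
        w = [0] * n
        for i in positions:
            w[i] = 1
        out.append(tuple(w))
    return out


def is_dyck(w):
    d = 0
    for step in w:
        d += 1 if step == 1 else -1   # d = x - y after the step
        if d > 0:
            return False              # the path went below the line y = x
    return True


def tip(w):
    """Tipping map g on a non-Dyck word w in R(0^n 1^n)."""
    d, best, idx = 0, 0, None
    for i, step in enumerate(w):
        d += 1 if step == 1 else -1
        if d > best:                  # first arrival at a new farthest line y = x - d
            best, idx = d, i
    out = list(w)
    out[idx] = 0                      # tip the east step arriving at v into a north step
    return tuple(out)
```

For $n = 4$, applying `tip` to the 56 non-Dyck words of $R(0^41^4)$ produces each of the 56 words of $R(0^51^3)$ exactly once, and the major index drops by 1 each time.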
8.12 Set Partitions and q-Stirling Numbers


Recall from §2.12 that the Stirling number of the second kind, denoted S(n,k), counts set 
partitions of {1,2,...,n} into k blocks. This section develops a g-analogue of S(n,k) that 
counts weighted set partitions. 

To define an appropriate statistic on set partitions, we use the following method of
encoding a set partition $P$ by a permutation $f(P)$. Recall that a set partition of $\{1,2,\ldots,n\}$
into $k$ blocks is a set $P = \{B_1, B_2, \ldots, B_k\}$ of pairwise disjoint, nonempty blocks $B_i$ such
that the union of $B_1, \ldots, B_k$ is $\{1,2,\ldots,n\}$. We choose the indexing of the blocks so that
$\min(B_1) > \min(B_2) > \cdots > \min(B_k)$, where $\min(B_i)$ denotes the least element of block
$B_i$. Furthermore, we present the elements within each block $B_i$ in increasing order. Finally,
we erase all the set braces to obtain a permutation of $\{1,2,\ldots,n\}$. For example, given the
set partition $P = \{\{8,4,5\}, \{7,2\}, \{1,3,9\}, \{6\}\}$, we first present the partition in standard
form $P = \{\{6\}, \{4,5,8\}, \{2,7\}, \{1,3,9\}\}$ and then erase the braces to get the permutation
$f(P) = 645827139$.

Note that each block $B_i$ in $P$ becomes an ascending run in $f(P)$ consisting of symbols
that increase from left to right. On the other hand, for all $j < k$, the largest symbol in block
$B_j$ must be greater than the smallest symbol in block $B_{j+1}$, because $\max(B_j) \geq \min(B_j) >
\min(B_{j+1})$ by the choice of indexing. Thus there is a descent in $f(P)$ every time we go
from one block to the next block. It follows that we can recover the set partition $P$ from
the permutation $f(P) = w_1 \cdots w_n$ by finding the descent set $\mathrm{Des}(f(P)) = \{i_1 < i_2 < \cdots <
i_{k-1}\}$ and setting
$$P = \{\{w_1, w_2, \ldots, w_{i_1}\}, \{w_{i_1+1}, \ldots, w_{i_2}\}, \{w_{i_2+1}, \ldots, w_{i_3}\}, \ldots, \{w_{i_{k-1}+1}, \ldots, w_n\}\}.$$
For example, $f^{-1}(794582613) = \{\{7,9\}, \{4,5,8\}, \{2,6\}, \{1,3\}\}$.
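
The encoding $f$ and the recovery of $P$ from the descent set can be sketched in a few lines of Python. This is an illustration with our own function names (`encode` for $f$, `decode` for $f^{-1}$); a set partition is passed as a collection of sets, and `decode` assumes a nonempty input word.

```python
def encode(partition):
    """Map a set partition to its permutation f(P), returned as a list."""
    # Sort each block increasingly, then order blocks by decreasing minimum.
    blocks = sorted((sorted(b) for b in partition), key=lambda b: b[0], reverse=True)
    return [x for b in blocks for x in b]


def decode(w):
    """Recover the set partition from a nonempty word by cutting at each descent."""
    blocks, run = [], [w[0]]
    for x in w[1:]:
        if x < run[-1]:          # a descent marks the boundary between two blocks
            blocks.append(run)
            run = [x]
        else:
            run.append(x)
    blocks.append(run)
    return {frozenset(b) for b in blocks}
```

For instance, `encode([{8, 4, 5}, {7, 2}, {1, 3, 9}, {6}])` returns `[6, 4, 5, 8, 2, 7, 1, 3, 9]`, the permutation 645827139 from the text.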

Let $SP(n,k)$ be the set of set partitions of $\{1,2,\ldots,n\}$ into $k$ blocks. The preceding
discussion defines a one-to-one map $f : SP(n,k) \to S_n$. We call permutations in the image of
$f$ Stirling permutations of type $(n,k)$. A permutation $w$ is in the image of $f$ iff $\mathrm{des}(w) = k-1$
and the list of first elements in the ascending runs of $w$ forms a decreasing subsequence of
$w$. We define a major index statistic on $SP(n,k)$ by letting $\mathrm{maj}(P) = \mathrm{maj}(f(P))$ for each
$P \in SP(n,k)$. The q-Stirling numbers of the second kind are defined to be
$$S_q(n,k) = G_{SP(n,k),\mathrm{maj}}(q) = \sum_{P \in SP(n,k)} q^{\mathrm{maj}(P)}.$$


These q-Stirling numbers are characterized by the following recursion (compare to
Recursion 2.45).


8.48. Theorem: Recursion for q-Stirling Numbers of the Second Kind. For all
integers $k$ and $n$ with $1 < k < n$,
$$S_q(n,k) = q^{k-1} S_q(n-1, k-1) + [k]_q S_q(n-1, k).$$
The initial conditions are $S_q(0,0) = 1$, $S_q(n,1) = 1$ for all $n \geq 1$, and $S_q(n,n) = q^{n(n-1)/2}$
for all $n \geq 1$.


Proof. Fix integers $k$ and $n$ with $1 < k < n$. The set $SP(n,k)$ is the disjoint union of sets
$A$ and $B$, where $A$ consists of set partitions of $\{1,2,\ldots,n\}$ into $k$ blocks where $n$ is in a
block by itself, and $B$ consists of set partitions where $n$ is in a block with other elements.
There is a bijection from $A$ to $SP(n-1,k-1)$ that sends $P \in A$ to the set partition $P'$
obtained from $P$ by deleting the block $\{n\}$. Consider the Stirling permutations $w = f(P)$
and $w' = f(P')$ that encode $P$ and $P'$. From the definition of the encoding, we see that
$w = n, w'$ where $w'$ is some permutation of $\{1,2,\ldots,n-1\}$ with $k - 2$ descents. Adding
$n$ as the new first symbol pushes each of these descents to the right one position, and we
also get a new descent at position 1 of $w$, since $w_1 = n > w_2 = w'_1$. This means that
$\mathrm{maj}(w) = \mathrm{maj}(w') + k - 1$. By the Weight-Shifting Rule, we conclude that
$$\sum_{P \in A} q^{\mathrm{maj}(P)} = q^{k-1} \sum_{P' \in SP(n-1,k-1)} q^{\mathrm{maj}(P')} = q^{k-1} S_q(n-1, k-1). \tag{8.9}$$

We can build each set partition $P \in B$ by first choosing a set partition $P^* =
\{B_1, B_2, \ldots, B_k\} \in SP(n-1,k)$, then choosing an index $j \in \{1,2,\ldots,k\}$, then inserting
$n$ as a new element of block $B_j$. Here, as above, we index the blocks of $P^*$ so that
$\min(B_1) > \min(B_2) > \cdots > \min(B_k)$. Let us compare the permutations $w = f(P)$ and
$w^* = f(P^*)$. We obtain $w$ from $w^*$ by inserting the symbol $n$ at the end of the $j$th ascending
run of $w^*$. This causes the $k - j$ descents following this run in $w^*$ to shift right one
position in $w$, whereas the $j - 1$ descents preceding this run in $w^*$ maintain their current
positions. Also, since $n$ is the largest symbol, no new descent is created by the insertion
of $n$. It follows that $\mathrm{maj}(w) = \mathrm{maj}(w^*) + k - j$. The generating function for the choice
of $P^*$ is $S_q(n-1,k)$, whereas the generating function for the choice of $j \in \{1,2,\ldots,k\}$ is
$q^{k-1} + q^{k-2} + \cdots + q^0 = [k]_q$. Thus, by the Product Rule and Bijection Rule for Weighted
Sets, we get
$$\sum_{P \in B} q^{\mathrm{maj}(P)} = [k]_q S_q(n-1, k). \tag{8.10}$$
The recursion in the theorem follows by adding (8.9) and (8.10).

The initial condition $S_q(0,0) = q^0 = 1$ holds since the major index of the empty word is 0.
The initial condition $S_q(n,1) = q^0 = 1$ holds since the unique set partition $P$ of $\{1,2,\ldots,n\}$
with one block has $w(P) = 1,2,\ldots,n$ and $\mathrm{maj}(w(P)) = 0$. The initial condition $S_q(n,n) =
q^{n(n-1)/2}$ holds since the unique set partition $P \in SP(n,n)$ has $w(P) = n, n-1, \ldots, 3, 2, 1$
and $\mathrm{maj}(w(P)) = 1 + 2 + \cdots + (n-1) = n(n-1)/2$. □
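
The recursion of Theorem 8.48 can be compared against a brute-force tabulation for small $n$. The sketch below is illustrative, with helper names of our choosing: `q_stirling` enumerates all set partitions, encodes each one as its Stirling permutation, and records the major index, while `recursion_rhs` assembles the right side of the recursion from smaller cases.

```python
from collections import Counter


def set_partitions(n):
    """Yield every set partition of {1, ..., n} as a list of increasing lists."""
    if n == 0:
        yield []
        return
    for p in set_partitions(n - 1):
        for i in range(len(p)):                  # put n into an existing block
            yield p[:i] + [p[i] + [n]] + p[i + 1:]
        yield p + [[n]]                          # or give n its own block


def maj_of_partition(p):
    """Major index of the Stirling permutation encoding the set partition p."""
    blocks = sorted(p, key=lambda b: b[0], reverse=True)   # b[0] is the block minimum
    w = [x for b in blocks for x in b]
    return sum(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def q_stirling(n, k):
    """S_q(n, k) as a Counter {exponent: coefficient}, by direct enumeration."""
    return Counter(maj_of_partition(p) for p in set_partitions(n) if len(p) == k)


def shift(poly, e):
    """Multiply a polynomial by q^e."""
    return Counter({d + e: c for d, c in poly.items()})


def recursion_rhs(n, k):
    """q^{k-1} S_q(n-1, k-1) + [k]_q S_q(n-1, k)."""
    rhs = shift(q_stirling(n - 1, k - 1), k - 1)
    for j in range(k):                           # [k]_q = 1 + q + ... + q^{k-1}
        rhs += shift(q_stirling(n - 1, k), j)
    return rhs
```

As a small check by hand, $S_q(3,2) = 2q + q^2$: the three set partitions of $\{1,2,3\}$ into two blocks are encoded as 312, 213, and 231, with major indices 1, 1, and 2.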


Summary 


• Generating Functions for Weighted Sets. A weighted set is a pair $(S, \mathrm{wt})$ where $S$ is a set
and $\mathrm{wt} : S \to \mathbb{Z}_{\geq 0}$ is a function called a statistic on $S$. The generating function for this
weighted set is $G_{S,\mathrm{wt}}(q) = G_S(q) = \sum_{z \in S} q^{\mathrm{wt}(z)}$. Writing $G_S(q) = \sum_{k \geq 0} a_k q^k$, $a_k$ is the
number of objects in $S$ having weight $k$.

• The Bijection Rule and the Weight-Shifting Rule. A weight-preserving bijection from
$(S, \mathrm{wt})$ to $(T, \mathrm{wt}')$ is a bijection $f : S \to T$ with $\mathrm{wt}'(f(z)) = \mathrm{wt}(z)$ for all $z \in S$.
When such an $f$ exists, $G_{S,\mathrm{wt}}(q) = G_{T,\mathrm{wt}'}(q)$. More generally, if there is $b \in \mathbb{Z}$ with
$\mathrm{wt}'(f(z)) = b + \mathrm{wt}(z)$ for all $z \in S$, then $G_{T,\mathrm{wt}'}(q) = q^b G_{S,\mathrm{wt}}(q)$.

• The Sum Rule for Weighted Sets. Suppose $(S_i, w_i)$ are weighted sets for $1 \leq i \leq k$, $S$ is
the disjoint union of the $S_i$, and we define $w : S \to \mathbb{Z}_{\geq 0}$ by $w(z) = w_i(z)$ for $z \in S_i$. Then
$G_S(q) = \sum_{i=1}^{k} G_{S_i}(q)$.

• The Product Rule for Weighted Sets. Suppose $S$ is a weighted set such that every $u \in S$
can be constructed in exactly one way by choosing $u_1 \in S_1$, choosing $u_2 \in S_2$, and so
on, finally choosing $u_k \in S_k$, and then assembling the chosen objects $u_1, \ldots, u_k$ in a
prescribed manner. Assume that whenever $u$ is constructed from $u_1, u_2, \ldots, u_k$, $\mathrm{wt}_S(u) =
\mathrm{wt}_{S_1}(u_1) + \mathrm{wt}_{S_2}(u_2) + \cdots + \mathrm{wt}_{S_k}(u_k)$. Then $G_S(q) = \prod_{i=1}^{k} G_{S_i}(q)$.


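• q-Integers, q-Factorials, q-Binomial Coefficients, and q-Multinomial Coefficients. Suppose
$q$ is a variable, $n, k, n_i \in \mathbb{Z}_{\geq 0}$, $0 \leq k \leq n$, and $\sum_i n_i = n$. We define
$[n]_q = \sum_{i=0}^{n-1} q^i = \frac{q^n - 1}{q - 1}$, $[n]!_q = \prod_{i=1}^{n} [i]_q$,
$\begin{bmatrix} n \\ k \end{bmatrix}_q = \frac{[n]!_q}{[k]!_q\, [n-k]!_q}$, and $\begin{bmatrix} n \\ n_1, \ldots, n_s \end{bmatrix}_q = \frac{[n]!_q}{\prod_{i=1}^{s} [n_i]!_q}$. These are all
polynomials in $q$ with coefficients in $\mathbb{Z}_{\geq 0}$.

• Recursions for q-Binomial Coefficients, etc. The following recursions hold:
$$[n]!_q = [n-1]!_q \cdot [n]_q;$$
$$\begin{bmatrix} a+b \\ a,\ b \end{bmatrix}_q = q^b \begin{bmatrix} a+b-1 \\ a-1,\ b \end{bmatrix}_q + \begin{bmatrix} a+b-1 \\ a,\ b-1 \end{bmatrix}_q = \begin{bmatrix} a+b-1 \\ a-1,\ b \end{bmatrix}_q + q^a \begin{bmatrix} a+b-1 \\ a,\ b-1 \end{bmatrix}_q;$$
$$\begin{bmatrix} n_1 + \cdots + n_s \\ n_1, \ldots, n_s \end{bmatrix}_q = \sum_{k=1}^{s} q^{n_1 + \cdots + n_{k-1}} \begin{bmatrix} n_1 + \cdots + n_s - 1 \\ n_1, \ldots, n_k - 1, \ldots, n_s \end{bmatrix}_q.$$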


• Statistics on Words. Given a word $w = w_1 w_2 \cdots w_n$ using a totally ordered alphabet,
$\mathrm{Inv}(w) = \{(i,j) : i < j$ and $w_i > w_j\}$, $\mathrm{inv}(w) = |\mathrm{Inv}(w)|$, $\mathrm{Des}(w) = \{i < n : w_i > w_{i+1}\}$,
$\mathrm{des}(w) = |\mathrm{Des}(w)|$, and $\mathrm{maj}(w) = \sum_{i \in \mathrm{Des}(w)} i$. We have
$$\begin{bmatrix} n_1 + \cdots + n_s \\ n_1, \ldots, n_s \end{bmatrix}_q = \sum_{w \in R(a_1^{n_1} \cdots a_s^{n_s})} q^{\mathrm{inv}(w)} = \sum_{w \in R(a_1^{n_1} \cdots a_s^{n_s})} q^{\mathrm{maj}(w)}.$$
The second equality follows from a bijection due to Foata, which maps maj to inv while
preserving the last letter of the word. In particular, letting $S_n = R(1^1 2^1 \cdots n^1)$ be the set
of permutations of $\{1,2,\ldots,n\}$,
$$[n]!_q = \sum_{w \in S_n} q^{\mathrm{inv}(w)} = \sum_{w \in S_n} q^{\mathrm{maj}(w)}.$$
These formulas can be proved bijectively by mapping $w \in S_n$ to its inversion table (or
major index table) $(t_n, \ldots, t_1)$, where $t_i$ records the change in inversions (or major index)
caused by inserting the symbol $i$ into the subword of $w$ consisting of symbols $1,2,\ldots,i-1$.

• The q-Binomial Theorem. For all variables $x, y, q$ and all $n \in \mathbb{Z}_{>0}$,
$$(x + qy)(x + q^2y)(x + q^3y) \cdots (x + q^ny) = \sum_{k=0}^{n} q^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q x^{n-k} y^k.$$


• Weighted Lattice Paths. The q-binomial coefficient $\begin{bmatrix} a+b \\ a,\ b \end{bmatrix}_q = \frac{[a+b]!_q}{[a]!_q\, [b]!_q}$ counts lattice paths in
an $a \times b$ (or $b \times a$) rectangle, weighted either by area above the path or area below the
path. This coefficient also counts integer partitions with first part at most $a$ and length
at most $b$, weighted by area, as well as anagrams in $R(0^a1^b)$, weighted by inversions or
major index.

• Weighted Dyck Paths. Let $C_n(q)$ be the generating function for Dyck paths of order
$n$, weighted by area between the path and $y = x$. Then $C_0(q) = 1$ and $C_n(q) =
\sum_{k=1}^{n} q^{k-1} C_{k-1}(q) C_{n-k}(q)$. The generating function for Dyck paths (viewed as words
in $R(0^n1^n)$) weighted by major index is
$\frac{1}{[n+1]_q} \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q = \begin{bmatrix} 2n \\ n,\ n \end{bmatrix}_q - q \begin{bmatrix} 2n \\ n-1,\ n+1 \end{bmatrix}_q$.

• Weighted Set Partitions. We encode a set partition $P$ as a Stirling permutation $f(P)$ by
listing the blocks of $P$ in decreasing order of their minimum elements, with entries in each
block written in increasing order. Define $\mathrm{maj}(P) = \mathrm{maj}(f(P))$ and $S_q(n,k) = \sum q^{\mathrm{maj}(P)}$
summed over all set partitions of $\{1,2,\ldots,n\}$ with $k$ blocks. These q-analogues of the
Stirling numbers of the second kind satisfy the recursion
$$S_q(n,k) = q^{k-1} S_q(n-1, k-1) + [k]_q S_q(n-1, k) \quad \text{for } 1 < k < n$$
with initial conditions $S_q(0,0) = 1$, $S_q(n,1) = 1$ for all $n \geq 1$, and $S_q(n,n) = q^{n(n-1)/2}$
for all $n \geq 1$.




Exercises 


In the exercises below, S;, denotes the set of permutations of {1,2,...,n}, unless otherwise 
specified. 

8-1. Let S = R(a'b'c?), T = R(a'b?c!), and U = R(a*b'c!). Confirm that Gs inv(q) = 
Grinv(q) = Gujinv(q) (as asserted in Example 8.17) by listing all weighted objects in T and 
U. 


8-2. (a) Compute inv(w), des(w), and maj(w) for each w € S4. (b) Use (a) to find the 
generating functions G's, inv(q@), G'sy,des(q), and G's, maj(q). (c) Compute [4]!, by polynomial 
multiplication, and compare to the answers in (b). 

8-3. (a) Compute inv(w) for the following words w: 4251673, 101101110001, 314423313, 
55233514425331. (b) Compute Des(w), des(w), and maj(w) for each word w in (a). 

8-4. Confirm the formulas for $G_{D_4,\mathrm{area}}(q)$ and $G_{D_4,\mathrm{maj}}(q)$ stated in Example 8.45 by listing 
all weighted Dyck paths of order 4. 

8-5. (a) Find the maximum value of inv(w), des(w), and maj(w) as w ranges over $S_n$. 
(b) Repeat (a) for w ranging over $R(1^{n_1}2^{n_2}\cdots s^{n_s})$. 

8-6. Let S be the set of k-letter words over the alphabet [n]. For $w \in S$, let wt(w) be the 
sum of all letters in w. Compute $G_{S,\mathrm{wt}}(q)$. 

8-7. Let S be the set of 5-letter words using the 26-letter English alphabet. For $w \in S$, let 
wt(w) be the number of vowels in w. Compute $G_{S,\mathrm{wt}}(q)$. 

8-8. Let S be the set of all subsets of {1,2,...,n}. For $A \in S$, let wt(A) = |A|. Use the 
Product Rule for Weighted Sets to compute $G_{S,\mathrm{wt}}(q)$. 

8-9. Let S be the set of all k-element multisets using the alphabet [n]. For $M \in S$, let 
wt(M) be the sum of the elements in M, counting multiplicities. Express $G_{S,\mathrm{wt}}(q)$ in terms 
of q-binomial coefficients. 

8-10. (a) How many permutations of {1,2,...,8} have exactly 17 inversions? (b) How many 
permutations of {1,2,...,9} have major index 29? 

8-11. (a) How many lattice paths from (0,0) to (8,6) have area 21? (b) How many words 
in R(0°1%27) have ten inversions? (c) How many Dyck paths of order 7 have major index 
30? 

8-12. Use an involution to prove $\sum_{k=0}^{n} (-1)^k q^{k(k-1)/2} \left[{n \atop k}\right]_q = 0$ for all integers n > 0. 


8-13. Compute each of the following polynomials by any method, expressing the answer in 


the form Dys9 ang". (a) [le (b) [6l!e (©) [8], @ [ogal, ©) eles), © my [23),- 

8-14. (a) Factor the polynomials $[4]_q$, $[5]_q$, $[6]_q$, and $[12]_q$ in Z[q]. (b) How do these 
polynomials factor in C[q]? 


8-15. Compute $\left[{4 \atop 2}\right]_q$ in six ways, by: (a) simplifying the formula in Definition 8.30; (b) using 
the first recursion in Theorem 8.32; (c) using the second recursion in Theorem 8.32; 
(d) enumerating words in R(0011) by inversions; (e) enumerating words in R(0011) by 
major index; (f) enumerating partitions contained in a 2 × 2 box by area. 


8-16. (a) Prove the identity | algebraically. 


N1,+-+5Mk 

(b) Give a combinatorial proof of the identity in (a). 

8-17. For $1 \le i \le 3$, let $(T_i, w_i)$ be a set of weighted objects. (a) Prove that $\mathrm{id}_{T_i} : T_i \to T_i$ is 
a weight-preserving bijection. (b) Prove that if $f : T_1 \to T_2$ is a weight-preserving bijection, 
then $f^{-1} : T_2 \to T_1$ is weight-preserving. (c) Prove that if $f : T_1 \to T_2$ and $g : T_2 \to T_3$ are 
weight-preserving bijections, so is $g \circ f$. 


k 
Ye | = | Ny +e +MNE 
q po, a eg 


8-18. Prove the second recursion in Theorem 8.32: (a) by an algebraic manipulation; (b) by 
removing the first step from a lattice path in an a x b rectangle. 


8-19. Let f be the map used to prove (8.3), with a = b = 4. Compute each of the following, 
and verify that weights are preserved. 

(a) f(2413, 1423, 10011010) 

(b) f(4321, 4321, 11110000) 

(c) f(2134, 3214, 01010101) 

8-20. Let f be the map used to prove (8.3), with a = 5 and b = 4. For each w given 
here, compute $f^{-1}(w)$ and verify that weights are preserved: (a) w = 123456789; (b) w = 
371945826; (c) w = 987456321. 

8-21. Repeat the previous exercise, taking a = 2 and b = 7. 

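8-22. Prove that $[n_1 + \cdots + n_s]!_q = [n_1]!_q \cdots [n_s]!_q \sum_{w \in R(1^{n_1}\cdots s^{n_s})} q^{\mathrm{inv}(w)}$ by defining a 
weight-preserving bijection $f : S_{n_1+\cdots+n_s} \to S_{n_1} \times \cdots \times S_{n_s} \times R(1^{n_1}\cdots s^{n_s})$. 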


8-23. (a) Find and prove a q-analogue of the identity >; (%) * = (*”) involving q-binomial 
coefficients (compare to Theorem 2.4 and Figure 2.1). (b) Similarly, derive a q-analogue of 
the identity y=, are a anes 

8-24. Let S be the set of two-card poker hands. For $H \in S$, let wt(H) be the sum of the 
values of the two cards in H, where aces count as 11 and jacks, queens, and kings count as 
10. Find $G_{S,\mathrm{wt}}(q)$. 

8-25. Define the weight of a five-card poker hand to be the number of face cards in the hand 
(the face cards are aces, jacks, queens, and kings). Compute the generating functions for the 
following sets of poker hands relative to this weight: (a) full house hands; (b) three-of-a-kind 
hands; (c) flush hands; (d) straight hands. 

8-26. Define the weight of a five-card poker hand to be the number of diamond cards in 
the hand. Compute the generating functions for the following sets of poker hands relative 
to this weight: (a) full house hands; (b) three-of-a-kind hands; (c) flush hands; (d) straight 
hands. 

8-27. Let $T_n$ be the set of connected simple graphs with vertex set {1,2,...,n}. Let the 
weight of a graph in $T_n$ be the number of edges. Compute $G_{T_n}(q)$ for $1 \le n \le 5$. 

8-28. Let G be the bijection in §8.4. Compute G(341265) and $G^{-1}(3,2,3,0,0,1)$, and verify 
that weights are preserved for these two objects. 

8-29. Let G be the bijection in §8.4. Compute G(35261784) and $G^{-1}(5,6,4,2,3,0,1,0)$, 
and verify that weights are preserved for these two objects. 

8-30. In §8.4, we constructed an inversion table for $w \in S_n$ by classifying inversions $(i,j) \in \mathrm{Inv}(w)$ 
based on the left value $w_i$. Define a new map $F : S_n \to [n] \times [n-1] \times \cdots \times [1]$ by 
classifying inversions $(i,j) \in \mathrm{Inv}(w)$ based on the right value $w_j$. Show that F is a bijection, 
and compute F(35261784) and $F^{-1}(5,6,4,3,2,0,1,0)$. 

8-31. Define a map $F : S_n \to [n] \times [n-1] \times \cdots \times [1]$ by setting $F(w) = (t_1,\ldots,t_n)$, 
where $t_i = |\{j : (i,j) \in \mathrm{Inv}(w)\}|$. Show that F is a bijection. (Informally, F classifies 
inversions of w based on the left position of the inversion pair.) Compute F(35261784) and 
$F^{-1}(5,6,4,3,2,0,1,0)$. 

8-32. Define a map $F : S_n \to [n] \times [n-1] \times \cdots \times [1]$ that classifies inversions of w based 
on the right position of the inversion pair (compare to the previous exercise). Show that F 
is a bijection, and compute F(35261784) and $F^{-1}(5,6,4,3,2,0,1,0)$. 

8-33. Let G be the bijection in §8.5. Compute G(341265) and $G^{-1}(3,2,3,1,0,0)$, and verify 
that weights are preserved for these two objects. 

8-34. Let G be the bijection in §8.5. Compute G(35261784) and $G^{-1}(5,6,4,2,3,0,1,0)$, 
and verify that weights are preserved for these two objects. 

8-35. (a) Define a bijection $H : S_n \to S_n$ such that maj(H(w)) = inv(w) for all $w \in S_n$ by 
combining the bijections in §8.4 and §8.5. (b) Compute H(41627853) and $H^{-1}(41627853)$. 
(c) Compute H(13576428) and $H^{-1}(13576428)$. 


8-36. Coinversions. Define the coinversions of a word $w = w_1w_2\cdots w_n$, denoted 
coinv(w), to be the number of pairs (i,j) with $1 \le i < j \le n$ and $w_i < w_j$. Prove 
$\sum_{w \in R(1^{n_1}2^{n_2}\cdots s^{n_s})} q^{\mathrm{coinv}(w)} = \left[{n_1+\cdots+n_s \atop n_1,\ldots,n_s}\right]_q$ (a) by using a bijection to reduce to the 
corresponding result for inv; (b) by verifying an appropriate recursion. 

8-37. Given a word $w = w_1\cdots w_n$, let comaj(w) be the sum of all i < n with $w_i < w_{i+1}$, 
and let rlmaj(w) be the sum of n − i for all i < n with $w_i > w_{i+1}$. Calculate $\sum_{w \in S_n} q^{\mathrm{comaj}(w)}$ 
and $\sum_{w \in S_n} q^{\mathrm{rlmaj}(w)}$. 

8-38. For $w \in S_n$, let wt(w) be the sum of all i < n such that i + 1 appears to the left of i 
in w. Compute $G_{S_n,\mathrm{wt}}(q)$. 

8-39. (a) Suppose $w = w_1w_2\cdots w_{n-1}$ is a fixed permutation of n − 1 distinct letters. Let a 
be a new letter less than all letters appearing in w. Let S be the set of n words that can be 
obtained from w by inserting a in some gap. Prove that $\sum_{z \in S} q^{\mathrm{maj}(z)} = q^{\mathrm{maj}(w)}[n]_q$. (b) Use 
(a) to obtain another proof that $\sum_{w \in S_n} q^{\mathrm{maj}(w)} = [n]!_q$. 

8-40. Suppose k is fixed in {1,2,...,n}, and $w = w_1w_2\cdots w_{n-1}$ is a fixed permutation of 
{1,2,...,k−1,k+1,...,n}. Let S be the set of n words that can be obtained from w by 
inserting k in some gap. Prove or disprove: $\sum_{z \in S} q^{\mathrm{maj}(z)} = q^{\mathrm{maj}(w)}[n]_q$. 

8-41. Define a cyclic shift function $c : \{1,2,\ldots,n\} \to \{1,2,\ldots,n\}$ by c(i) = i + 1 for i < n, 
and c(n) = 1. Define a map $C : S_n \to S_n$ by setting $C(w_1w_2\cdots w_n) = c(w_1)c(w_2)\cdots c(w_n)$. 
(a) Prove: for all $w \in S_n$, maj(C(w)) = maj(w) − 1 if $w_n \ne n$, and maj(C(w)) = maj(w) + n − 1 
if $w_n = n$. (b) Use (a) to show combinatorially that, for $1 \le k \le n$, 
$\sum_{w \in S_n : w_n = k} q^{\mathrm{maj}(w)} = q^{n-k} \sum_{v \in S_{n-1}} q^{\mathrm{maj}(v)}$. (c) Use (b), the Sum Rule, and induction to obtain another proof of 
Theorem 8.26. 


8-42. Prove the following q-analogue of the Negative Binomial Theorem: for all $n \in \mathbb{Z}_{>0}$, 

\[ \frac{1}{(1-t)(1-tq)(1-tq^2)\cdots(1-tq^{n-1})} = \sum_{k=0}^{\infty} \left[{k+n-1 \atop k,\,n-1}\right]_q t^k. \]


8-43. For all n > 1, all T C {1,2,...,n—1}, and1<k <n, let G(n,T,k) be the number 
of permutations w of {1,2,...,n} with Des(w) = T and w, = k. (a) Find a recursion for 
the quantities G(n, T, k). (b) Count the number of permutations of 10 objects with descent 
set {2,3,5, 7}. 

8-44. Let w be the word 4523351452511332, and let $h_z$ be the map from §8.10. Compute 
$h_z(w)$ for z = 1,2,3,4,5,6. Verify that (8.6) or (8.7) holds in each case. 

8-45. Let w be the word 4523351452511332, and let $h_z$ be the map from §8.10. Compute 
$h_z^{-1}(w)$ for z = 1,2,3,4,5,6. 

8-46. Compute the image of each w € S4 under the map g from §8.10. 


8-47. Let g be the map in §8.10. Compute g(w) for each of these words, and verify that 
inv(g(w)) = maj(w). (a) 4251673 (b) 27418563 (c) 101101110001 (d) 314423313 

8-48. Let g be the map in §8.10. Compute $g^{-1}(w)$ for each word w in the preceding exercise. 

8-49. Let g be the bijection in the proof of Theorem 8.47. Compute g(w) for each non-Dyck 
word w € R(0713). 

8-50. q-Fibonacci Numbers. (a) Let $W_n$ be the set of words in $\{0,1\}^n$ with no two 
consecutive 0’s, and let the weight of a word be the number of 0’s in it. Find a recursion 
for the generating functions $G_{W_n}(q)$, and use this to compute Gy, (q). (b) Repeat part (a), 
taking the weight to be the number of 1’s in the word. 


8-51. Let $T_n$ be the set of trees with vertex set {1,2,...,n}. Can you find a statistic on 
trees such that the associated generating function satisfies $G_{T_n}(q) = [n]_q^{n-2}$? 

8-52. Multivariable Generating Functions. Suppose S is a finite set, and 
$w_1, \ldots, w_n : S \to \mathbb{Z}_{\ge 0}$ are n statistics on S. The generating function for S relative to the n weights 
$w_1, \ldots, w_n$ is the polynomial $G_{S,w_1,\ldots,w_n}(q_1,\ldots,q_n) = \sum_{z \in S} \prod_{i=1}^{n} q_i^{w_i(z)}$. Formulate and 
prove versions of the Sum Rule, Product Rule, Bijection Rule, and Weight-Shifting Rule 
for such generating functions. 

8-53. Recall that we can view permutations $w \in S_n$ as bijective maps of {1,2,...,n} 
into itself. Define $I : S_n \to S_n$ by $I(w) = w^{-1}$ for $w \in S_n$. (a) Show that $I \circ I = \mathrm{id}_{S_n}$. 
(b) Show that inv(I(w)) = inv(w) for all $w \in S_n$. (c) Define imaj(w) = maj(I(w)) for all 
$w \in S_n$. Compute the two-variable generating function $G_n(q,t) = \sum_{w \in S_n} q^{\mathrm{maj}(w)} t^{\mathrm{imaj}(w)}$ 
for $1 \le n \le 4$. Prove that $G_n(q,t) = G_n(t,q)$. 

8-54. Let g be the map in §8.10, and let $\mathrm{IDes}(w) = \mathrm{Des}(w^{-1})$ for $w \in S_n$. (a) Show that 
for all $w \in S_n$, IDes(g(w)) = IDes(w). (b) Construct a bijection $h : S_n \to S_n$ such that, for 
all $w \in S_n$, inv(h(w)) = maj(w) and maj(h(w)) = inv(w). 

8-55. Let $P_n$ be the set of integer partitions whose diagrams fit in the diagram of 
(n−1, n−2, ..., 2, 1, 0), i.e., $\mu \in P_n$ iff $\ell(\mu) < n$ and $\mu_i \le n - i$ for $1 \le i < n$. Let $G_n(q) = \sum_{\mu \in P_n} q^{|\mu|}$. 
Find a recursion satisfied by $G_n(q)$ and use this to calculate $G_5(q)$. What is the relation 
between $G_n(q)$ and the q-Catalan number $C_n(q)$ from §8.11? 


8-56. For each set partition P, find the associated Stirling permutation f(P) and compute 

maj(P). (a) {{3, 7,5}, {8, 2}, {4, 1, 6}} (b) {{1, 2n}, {2, 2n — 1}, {3, 2n — 2},..., {n,n + 1}} 

(c) {{1, 2}, {3, 4},..., {2n — 1, 2n}} 

8-57. Find the q-Stirling numbers S,(4,k) for 1 < k < 4 by enumerating all weighted set 

partitions of {1,2,3,4} and the associated Stirling permutations. 

8-58. Use the recursion in Theorem 8.48 to compute the g-Stirling numbers S,(n,k) for 

1<k<n<e. 

8-59. For a set partition $P = \{B_1, \ldots, B_k\}$ with $\min(B_1) > \cdots > \min(B_k)$, show that 
$\mathrm{maj}(P) = \sum_{i=1}^{k} (k - i)|B_i|$. 

8-60. Let R(n,k) be the set of non-attacking placements of n — k rooks on the board A, 

(see 2.55). Define the weight of such a placement as follows. Each rook in the placement 




cancels all squares above it in its column. The weight of the placement is the total number 
of uncanceled squares located due west of rooks in the placement. Find a recursion for the 
generating functions g(n,k) = Grn,n),wt(q), which are q-analogues of the Stirling numbers 
of the second kind. Compute these generating functions for 1 < k <n < 5. (Compare to 
the q-Stirling numbers S,(n, k) from §8.12.) 

8-61. A left-to-right minimum of a permutation $w = w_1w_2\cdots w_n$ is a value $w_i$ such that 
$w_i = \min\{w_1, w_2, \ldots, w_i\}$. For $w \in S_n$, let lrmin(w) be the number of left-to-right minima 
of w. (a) Define a bijection from the set of permutations in $S_n$ with k cycles onto the 
set $\{w \in S_n : \mathrm{lrmin}(w) = k\}$. (b) Define $c_q(n,k) = \sum_{w \in S_n : \mathrm{lrmin}(w) = k} q^{\mathrm{inv}(w)}$, which is a 
q-analogue of the signless Stirling number of the first kind. Prove the recursion 

\[ c_q(n,k) = q^{n-1} c_q(n-1,k-1) + [n-1]_q c_q(n-1,k). \]

For which k and n is the recursion valid? What are the initial conditions? 

8-62. Calculate the polynomials $c_q(n,k)$ in the preceding exercise for $1 \le k \le n \le 5$. 

8-63. Find and prove a matrix identity relating the q-Stirling numbers $S_q(n,k)$ and 
$(-1)^{n+k} c_q(n,k)$. 

8-64. Prove that 


S,(n, k) = SA iq@ "| [k aie 


j=0 J []!q 

8-65. For $n \ge 1$, write $\sum_{w \in S_n} q^{\mathrm{des}(w)} = \sum_{k} a_{n,k} q^k$. Prove that 

\[ a_{n,k} = \sum_{i=0}^{k+1} (-1)^i \binom{n+1}{i} (k+1-i)^n. \]


8-66. Bounce Statistic on Dyck Paths. Given a Dyck path $P \in D_n$, define a new weight 
bounce(P) as follows. A ball starts at (0,0) and moves north and east to (n,n) according to 
the following rules. The ball moves north $v_0$ steps until blocked by an east step of P, then 
moves east $v_0$ steps to the line y = x. The ball then moves north $v_1$ steps until blocked by 
the east step of P starting on the line $x = v_0$, then moves east $v_1$ steps to the line y = x. 
This bouncing process continues, generating a sequence $(v_0, v_1, \ldots, v_s)$ of vertical moves 
adding to n. We define $\mathrm{bounce}(P) = \sum_{i=0}^{s} i v_i$ and $C_n(q,t) = \sum_{P \in D_n} q^{\mathrm{area}(P)} t^{\mathrm{bounce}(P)}$. 
(a) Calculate $C_n(q,t)$ for $1 \le n \le 4$ by enumerating Dyck paths. (b) Let 
$C_{n,k}(q,t) = \sum_P q^{\mathrm{area}(P)} t^{\mathrm{bounce}(P)}$ be the generating function for Dyck paths that start with 
exactly k north steps. Establish the recursion 

\[ C_{n,k}(q,t) = \sum_{r=1}^{n-k} t^{n-k} q^{k(k-1)/2} \left[{r+k-1 \atop r,\,k-1}\right]_q C_{n-k,r}(q,t) \]

by removing the first bounce. Show also that $C_n(q,t) = t^{-n} C_{n+1,1}(q,t)$. (c) Use the 
recursion in (b) to calculate $C_n(q,t)$ for n = 5, 6. (d) Prove that 
$q^{n(n-1)/2} C_n(q,1/q) = \sum_{P \in D_n} q^{\mathrm{maj}(P)}$. (e) Can you prove bijectively that $C_n(q,t) = C_n(t,q)$ for all $n \ge 1$? (If so, 
please contact the author.) 

8-67. Let $G_n$ be the set of sequences $g = (g_0, g_1, \ldots, g_{n-1})$ of nonnegative integers with 
$g_0 = 0$ and $g_{i+1} \le g_i + 1$ for all i < n − 1. For $g \in G_n$, define $\mathrm{area}(g) = \sum_{i=0}^{n-1} g_i$, and let 
dinv(g) be the number of pairs i < j with $g_i - g_j \in \{0,1\}$. (a) Find a bijection $k : G_n \to D_n$ 
such that area(k(g)) = area(g) for all $g \in G_n$. (b) Find a bijection $h : G_n \to D_n$ such that 
area(h(g)) = dinv(g) and bounce(h(g)) = area(g) for all $g \in G_n$ (see the previous exercise). 
Conclude that the statistics dinv, bounce, area (on $G_n$), and area (on $D_n$) all have the same 
distribution. 

Notes 


The idea used to prove Theorem 8.26 seems to have first appeared in [56]. The bijection 
in §8.10 is due to Foata [33]. For related material, see [35]. Much of the early work on 
permutation statistics, including proofs of Theorems 8.43 and 8.47, is due to Major Percy 
MacMahon [85]. The bijective proof of Theorem 8.47, along with other material on q-Catalan 
numbers, may be found in [42]. The bounce statistic in Exercise 8-66 was introduced by 
Haglund [57]; for more on this topic, see Haglund’s book [58]. 


9 


Tableaux and Symmetric Polynomials 


In this chapter, we study combinatorial objects called tableaux. Informally, a tableau is a 
filling of the cells in the diagram of an integer partition with values that weakly increase 
reading across rows and strictly increase reading down columns. We use tableaux to give 
a combinatorial definition of Schur polynomials, which are examples of symmetric poly- 
nomials. The theory of symmetric polynomials nicely demonstrates the interplay between 
combinatorics and algebra. We give an introduction to this vast subject in this chapter, 
stressing bijective proofs throughout. 




9.1 Fillings and Tableaux 


Before defining tableaux, we review the definitions of integer partitions and their diagrams 
from §2.11. A partition¹ of n is a weakly decreasing sequence of positive integers with sum n. 
Given a partition $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$, we call $\mu_i$ the ith part of $\mu$. We write $|\mu| = \mu_1 + \cdots + \mu_k$ 
for the sum of the parts of $\mu$, and we write $\ell(\mu) = k$ for the length (number of positive 
parts) of $\mu$. It is convenient to define $\mu_i = 0$ for all $i > \ell(\mu)$. The diagram of the partition 
$\mu$ is the set 

\[ \mathrm{dg}(\mu) = \{(i,j) \in \mathbb{Z}_{>0} \times \mathbb{Z}_{>0} : 1 \le i \le \ell(\mu),\ 1 \le j \le \mu_i\}. \]

We represent the diagram of $\mu$ as an array of unit squares with $\mu_i$ left-justified squares in 
the ith row from the top. This is called the English notation for partition diagrams. Some 
authors use French notation for partition diagrams, in which there are $\mu_i$ squares in the ith 
row from the bottom. 

We obtain new combinatorial objects called fillings by putting a number in each box of 
a partition diagram. For example, taking $\mu = (5,5,2)$, here is a filling of shape $\mu$: 

    4 3 3 7 2
    1 1 3 9 1
    5 6


We can define fillings formally as follows. 


9.1. Definition: Fillings. Given a partition $\mu$ and a set X, a filling of shape $\mu$ with values 
in X is a function $T : \mathrm{dg}(\mu) \to X$. 


For each cell $(i,j) \in \mathrm{dg}(\mu)$, T(i,j) is the value appearing in that cell of the partition 
diagram. In the filling depicted above, we have T(1,1) = 4, T(1,2) = 3, T(1,3) = 3, 
T(1,4) = 7, T(1,5) = 2, T(2,1) = 1, and so on. In most cases, the alphabet X is the set 
[N] = {1,2,...,N} for some fixed N > 0, or $\mathbb{Z}_{>0}$, or $\mathbb{Z}$. 

Next we define special fillings called tableaux (the terms semistandard tableaux, Young 
tableaux, and column-strict tableaux are also used). A tableau is a filling where the values 
weakly increase as we read each row from left to right, and the values strictly increase as 
we read each column from top to bottom. A standard tableau is a tableau with n boxes 
containing each value in {1,2,...,n} exactly once. We can state these definitions more 
formally as follows. 

¹In this chapter, the unqualified term “partition” will always refer to an integer partition rather than a 
set partition. 


9.2. Definition: Tableaux and Standard Tableaux. Given a filling $T : \mathrm{dg}(\mu) \to \mathbb{Z}$, T 
is called a (semistandard) tableau of shape $\mu$ iff $T(i,j) \le T(i,j+1)$ for all i, j such that 
(i,j) and (i,j+1) both belong to $\mathrm{dg}(\mu)$, and $T(i,j) < T(i+1,j)$ for all i, j such that (i,j) 
and (i+1,j) both belong to $\mathrm{dg}(\mu)$. A tableau T is called a standard tableau of shape $\mu$ iff 
T is a bijection from $\mathrm{dg}(\mu)$ to {1,2,...,n}, where $n = |\mu|$. 


Note that the plural form of “tableau” is “tableaux;” both words are pronounced tab—loh. 


9.3. Example. Consider the three fillings of shape (3,2,2) shown here: 

    T_1 = 1 1 3      T_2 = 1 2 6      T_3 = 1 2 5
          3 4              3 5              3 2
          5 5              4 7              4 5

$T_1$ is a tableau that is not standard, and $T_2$ is a standard tableau. $T_3$ is not a tableau, 
because of the strict decrease 3 > 2 in row 2, and also because of the weak increase $2 \le 2$ 
in column 2. Here is a tableau of shape (6,5,3,3): 


9.4. Example. There are five standard tableaux of shape (3,2), as shown here: 

    1 2 3    1 2 4    1 2 5    1 3 4    1 3 5
    4 5      3 5      3 4      2 5      2 4

As mentioned in the Introduction, there is a surprising formula for counting the number of 
standard tableaux of a given shape $\mu$. We discuss this formula in §12.12. 


We introduce the following notation for certain sets of tableaux. 


9.5. Definition: $\mathrm{SSYT}_N(\mu)$, $\mathrm{SSYT}(\mu)$, and $\mathrm{SYT}(\mu)$. Given a partition $\mu$ and a positive 
integer N, let $\mathrm{SSYT}_N(\mu)$ be the set of all (semistandard Young) tableaux of shape $\mu$ with 
values in {1,2,...,N}. Let $\mathrm{SSYT}(\mu)$ be the set of all tableaux of shape $\mu$ with values in 
$\mathbb{Z}_{>0}$. Let $\mathrm{SYT}(\mu)$ be the set of all standard tableaux of shape $\mu$. 


For each fixed $\mu$ and N, $\mathrm{SYT}(\mu)$ and $\mathrm{SSYT}_N(\mu)$ are finite sets, but $\mathrm{SSYT}(\mu)$ is infinite. 
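
Since $\mathrm{SSYT}_N(\mu)$ and $\mathrm{SYT}(\mu)$ are finite, small cases can be listed by machine. The Python sketch below is one possible way to do this, not a method taken from the text: the function names `ssyt` and `syt` are hypothetical, and tableaux are represented as lists of rows. Cells are filled in reading order, and each entry is bounded below by its left neighbor (rows weakly increase) and by one more than the entry above it (columns strictly increase).

```python
def ssyt(shape, N):
    """Yield all semistandard tableaux of the given shape with entries in 1..N."""
    cells = [(i, j) for i, row_len in enumerate(shape) for j in range(row_len)]
    T = [[0] * row_len for row_len in shape]

    def fill(pos):
        if pos == len(cells):
            yield [row[:] for row in T]        # emit a copy of the finished tableau
            return
        i, j = cells[pos]
        lo = 1
        if j > 0:
            lo = max(lo, T[i][j - 1])          # weak increase along rows
        if i > 0:
            lo = max(lo, T[i - 1][j] + 1)      # strict increase down columns
        for v in range(lo, N + 1):
            T[i][j] = v
            yield from fill(pos + 1)

    yield from fill(0)


def syt(shape):
    """Yield all standard tableaux: tableaux using each of 1..n exactly once."""
    n = sum(shape)
    for T in ssyt(shape, n):
        if sorted(v for row in T for v in row) == list(range(1, n + 1)):
            yield T
```

For example, `list(syt((3, 2)))` produces the five standard tableaux of shape (3,2) from Example 9.4. Filtering all semistandard tableaux is wasteful for large shapes, but it keeps the sketch close to Definition 9.2.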




9.2 Schur Polynomials 


We now define a weight function on fillings that keeps track of the number of times each 
value appears. 


9.6. Definition: Content of a Filling. Given a filling $T : \mathrm{dg}(\mu) \to \mathbb{Z}_{>0}$, the content of 
T is the infinite sequence $c(T) = (c_1, c_2, \ldots, c_k, \ldots)$, where $c_k$ is the number of times the 
value k appears in T. Formally, $c_k = |\{(i,j) \in \mathrm{dg}(\mu) : T(i,j) = k\}|$. Given formal variables 
$x_1, x_2, \ldots, x_k, \ldots$, the content monomial of T is 

\[ x^T = x^{c(T)} = x_1^{c_1} x_2^{c_2} \cdots x_k^{c_k} \cdots = \prod_{(i,j) \in \mathrm{dg}(\mu)} x_{T(i,j)}. \]


9.7. Example. The filling $T_1$ shown in Example 9.3 has content $c(T_1) = 
(2,0,2,1,2,0,0,\ldots)$ and content monomial $x^{T_1} = x_1^2 x_3^2 x_4 x_5^2$. For the other fillings in that 
example, 


x? = L1MQAZLMAUM5LUMGEL7, x3 = Ly L5C3L4L-, x = LAL ALeLLgLy. 


All five standard tableaux in Example 9.4 have content monomial $x_1x_2x_3x_4x_5$. More 
generally, if $\mu$ is a partition of n, then the content monomial of any $S \in \mathrm{SYT}(\mu)$ is $x_1x_2\cdots x_n$. 


We now define the Schur polynomials, which can be viewed as generating functions for 
semistandard tableaux weighted by content. 


9.8. Definition: Schur Polynomials. Given a partition $\mu$ and an integer $N \ge 1$, the 
Schur polynomial in N variables indexed by $\mu$ is 

\[ s_\mu(x_1, x_2, \ldots, x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^T. \]


9.9. Example. Let us compute the Schur polynomials $s_\mu(x_1, x_2, x_3)$ for all $\mu$ with $|\mu| = 3$. 
First, when $\mu = (3)$, we have the following semistandard tableaux of shape (3) using the 
alphabet {1,2,3}: 

    1 1 1    1 1 2    1 1 3    1 2 2    1 2 3
    1 3 3    2 2 2    2 2 3    2 3 3    3 3 3

It follows that 

\[ s_{(3)}(x_1,x_2,x_3) = x_1^3 + x_1^2x_2 + x_1^2x_3 + x_1x_2^2 + x_1x_2x_3 + x_1x_3^2 + x_2^3 + x_2^2x_3 + x_2x_3^2 + x_3^3. \]


Second, when $\mu = (2,1)$, we obtain the following semistandard tableaux: 

    1 1    1 1    1 2    1 3    1 2    1 3    2 2    2 3
    2      3      2      3      3      2      3      3

So $s_{(2,1)}(x_1,x_2,x_3) = x_1^2x_2 + x_1^2x_3 + x_1x_2^2 + x_1x_3^2 + 2x_1x_2x_3 + x_2^2x_3 + x_2x_3^2$. Third, when 
$\mu = (1,1,1)$, we see that $s_{(1,1,1)}(x_1,x_2,x_3) = x_1x_2x_3$, since there is only one semistandard 
tableau in this case. 

Now consider what happens when we change N, the number of variables. Suppose first 
that we use N = 2 instead of N = 3. This means that the allowed alphabet for the tableaux 
has changed to {1,2}. Consulting the tableaux just computed, but disregarding those that 
use the letter 3, we conclude that 


\[ s_{(3)}(x_1,x_2) = x_1^3 + x_1^2x_2 + x_1x_2^2 + x_2^3; \quad s_{(2,1)}(x_1,x_2) = x_1^2x_2 + x_1x_2^2; \quad s_{(1,1,1)}(x_1,x_2) = 0. \]


In these examples, note that we can obtain the polynomial $s_\mu(x_1,x_2)$ from $s_\mu(x_1,x_2,x_3)$ 
by setting $x_3 = 0$. More generally, we claim that for any $\mu$ and any N > M, we can obtain 
$s_\mu(x_1,\ldots,x_M)$ from $s_\mu(x_1,\ldots,x_M,\ldots,x_N)$ by setting the last N − M variables equal to 
zero. To verify this, consider the defining formula 

\[ s_\mu(x_1,x_2,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^T. \]

Upon setting $x_{M+1} = \cdots = x_N = 0$ in this formula, the terms coming from tableaux T 
that use values larger than M become zero. We are left with the sum over $T \in \mathrm{SSYT}_M(\mu)$, 
which is precisely $s_\mu(x_1,x_2,\ldots,x_M)$. 
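
The restriction property just described can be tested on the examples computed so far. In the following Python sketch (an illustration under stated assumptions, with hypothetical names `ssyt`, `schur`, and `restrict`), a polynomial is stored as a `Counter` mapping exponent vectors to coefficients, so that $s_\mu(x_1,\ldots,x_N)$ is obtained by tallying the contents of all tableaux in $\mathrm{SSYT}_N(\mu)$.

```python
from collections import Counter


def ssyt(shape, N):
    """Yield all semistandard tableaux of the given shape with entries in 1..N."""
    cells = [(i, j) for i, row_len in enumerate(shape) for j in range(row_len)]
    T = [[0] * row_len for row_len in shape]

    def fill(pos):
        if pos == len(cells):
            yield [row[:] for row in T]
            return
        i, j = cells[pos]
        lo = 1
        if j > 0:
            lo = max(lo, T[i][j - 1])          # rows weakly increase
        if i > 0:
            lo = max(lo, T[i - 1][j] + 1)      # columns strictly increase
        for v in range(lo, N + 1):
            T[i][j] = v
            yield from fill(pos + 1)

    yield from fill(0)


def schur(shape, N):
    """Schur polynomial s_shape(x_1..x_N) as {exponent vector: coefficient}."""
    poly = Counter()
    for T in ssyt(shape, N):
        content = [0] * N
        for row in T:
            for v in row:
                content[v - 1] += 1            # one factor x_v per cell
        poly[tuple(content)] += 1
    return poly


def restrict(poly, M):
    """Set all variables after x_M to zero: keep monomials not involving them."""
    return Counter({exp[:M]: c for exp, c in poly.items()
                    if all(e == 0 for e in exp[M:])})
```

With this representation, `schur((2, 1), 3)[(1, 1, 1)]` is the coefficient 2 of $x_1x_2x_3$ found in Example 9.9, and `restrict(schur((2, 1), 3), 2)` agrees with `schur((2, 1), 2)`.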
Suppose instead that we increase the number of variables from N = 3 to N = 5. Here 
we must draw new tableaux to find the new Schur polynomial. For instance, the tableaux 
for $\mu = (1,1,1)$ are: 

    1  1  1  1  1  1  2  2  2  3
    2  2  2  3  3  4  3  3  4  4
    3  4  5  4  5  5  4  5  5  5

Accordingly, 

\[ s_{(1,1,1)}(x_1,x_2,x_3,x_4,x_5) = x_1x_2x_3 + x_1x_2x_4 + x_1x_2x_5 + \cdots + x_3x_4x_5. \]


9.10. Example. A semistandard tableau of shape² $(1^k)$ using the alphabet [N] = 
{1,2,...,N} is essentially a strictly increasing sequence of k elements of [N], which can 
be identified with a k-element subset of [N]. Combining this remark with the definition of 
Schur polynomials, we conclude that 

\[ s_{(1^k)}(x_1,\ldots,x_N) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \tag{9.1} \]

Similarly, a semistandard tableau of shape (k) is a weakly increasing sequence of k elements 
of [N], which can be identified with a k-element multiset using values in [N]. So 

\[ s_{(k)}(x_1,\ldots,x_N) = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \tag{9.2} \]


9.11. Example. Given any integer $N \ge 4$, what is the coefficient of $x_1^2x_2^2x_3^2x_4$ in the Schur 
polynomial $s_{(4,3)}(x_1,\ldots,x_N)$? The answer is the number of semistandard tableaux of shape 
(4,3) where 1, 2, and 3 each appear twice, and 4 appears once. Equivalently, we seek all 
tableaux with shape (4,3) and content (2,2,2,1). (It is customary to omit trailing zeroes 
from the content vector.) The tableaux satisfying these conditions are shown here: 

    1 1 2 2    1 1 2 3    1 1 2 4    1 1 3 3
    3 3 4      2 3 4      2 3 3      2 2 4

So the requested coefficient is 4. Next, what is the coefficient of $x_1x_2^2x_3^2x_4^2$? Now we must 
find the tableaux of shape (4,3) and content (1,2,2,2), which are the following: 

    1 2 2 3    1 2 2 4    1 2 3 3    1 2 3 4
    3 4 4      3 3 4      2 4 4      2 3 4

Again there are four tableaux, so the coefficient of $x_1x_2^2x_3^2x_4^2$ is 4. Next, what is the coefficient 
of $x_1^2x_2x_3^2x_4^2$? Drawing the tableaux of shape (4,3) and content (2,1,2,2) produces these 
objects: 

    1 1 2 3    1 1 2 4    1 1 3 3    1 1 3 4
    3 4 4      3 3 4      2 4 4      2 3 4

The coefficient is 4 once again. One may check that for the content (2,2,1,2), and indeed 
for any rearrangement of (2,2,2,1,0,0,...) with all zeroes after position N, the number 
of semistandard tableaux of shape (4,3) having this content is always 4. This is not a 
coincidence; it is a consequence of the fact that Schur polynomials are symmetric, which we 
prove in §9.5. 

²Recall that when listing the parts of integer partitions, $1^k$ abbreviates a sequence of k ones. 
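
The claim about rearranged contents can be checked by brute force for N = 4. The sketch below is only a numerical check, not a proof, and its names (`ssyt`, `content_counts`) are hypothetical. It tallies the semistandard tableaux of shape (4,3) with entries in {1,2,3,4} by content, then looks up every rearrangement of (2,2,2,1).

```python
from collections import Counter
from itertools import permutations


def ssyt(shape, N):
    """Yield all semistandard tableaux of the given shape with entries in 1..N."""
    cells = [(i, j) for i, row_len in enumerate(shape) for j in range(row_len)]
    T = [[0] * row_len for row_len in shape]

    def fill(pos):
        if pos == len(cells):
            yield [row[:] for row in T]
            return
        i, j = cells[pos]
        lo = 1
        if j > 0:
            lo = max(lo, T[i][j - 1])          # rows weakly increase
        if i > 0:
            lo = max(lo, T[i - 1][j] + 1)      # columns strictly increase
        for v in range(lo, N + 1):
            T[i][j] = v
            yield from fill(pos + 1)

    yield from fill(0)


def content_counts(shape, N):
    """Map each content vector (c_1..c_N) to the number of tableaux having it."""
    counts = Counter()
    for T in ssyt(shape, N):
        entries = [v for row in T for v in row]
        counts[tuple(entries.count(k) for k in range(1, N + 1))] += 1
    return counts


counts = content_counts((4, 3), 4)
rearrangements = sorted(set(permutations((2, 2, 2, 1))))
for c in rearrangements:
    print(c, counts[c])
```

Each of the four rearrangements of (2,2,2,1) is reported with the same count, in agreement with the three cases worked out by hand in the example.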


9.12. Remark. We have presented a combinatorial definition of Schur polynomials as a 
weighted sum of semistandard tableaux. We can also define Schur polynomials algebraically 
as a quotient of two determinants; see Theorem 10.45. Alternatively, we can define Schur 
polynomials via determinants of matrices whose entries are the symmetric polynomials $e_k$ 
and $h_k$ defined below; see Theorems 10.60 and 10.61. Many properties of Schur polynomials 
can be established either combinatorially or algebraically. In this text, we focus on the 
combinatorial proofs. Macdonald’s comprehensive monograph [84] approaches the subject 
from a much more algebraic perspective. 




9.3 Symmetric Polynomials 


The examples of Schur polynomials computed in the last section were all symmetric; in 
other words, permuting the subscripts of the x-variables in any fashion did not change the 
answer. This section begins our examination of the general theory of symmetric polynomials. 
Throughout the discussion, we consider polynomials with real coefficients in variables 
$x_1, x_2, \ldots, x_N$, where N is a fixed positive integer. Everything said below applies more 
generally to polynomials with coefficients in any field K containing $\mathbb{Q}$ (the rational numbers). 
The symbol $K[x_1, x_2, \ldots, x_N]$ denotes the set of all polynomials in $x_1, \ldots, x_N$ with 
coefficients in K. Given a polynomial f in this set, the expression $f(a_1, a_2, \ldots, a_N)$ is the object 
we get by replacing each formal variable $x_i$ by $a_i$ for $1 \le i \le N$. (This is a special case 
of an evaluation homomorphism, which is discussed in the Appendix.) Next, recall that 
$S_N$ is the set of all permutations of the alphabet [N] = {1,2,...,N}, which are 
bijections $w : [N] \to [N]$. With this notation in hand, we can now formally define symmetric 
polynomials. 


9.13. Definition: Symmetric Polynomials. A polynomial $f \in \mathbb{R}[x_1, \ldots, x_N]$ is 
symmetric iff 

\[ f(x_{w(1)}, x_{w(2)}, \ldots, x_{w(N)}) = f(x_1, x_2, \ldots, x_N) \quad \text{for all } w \in S_N. \]

This means that any permutation of the variables $x_i$ leaves f unchanged. Since any 
permutation can be achieved by a finite sequence of basic transpositions (by Theorem 7.29), 
f is symmetric iff for all i with $1 \le i < N$, interchanging $x_i$ and $x_{i+1}$ in f leaves f unchanged. 
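
The criterion in terms of basic transpositions gives a simple finite test. The Python sketch below illustrates it under the assumption that a polynomial is stored as a dictionary from exponent vectors to nonzero coefficients; the names `swap_variables` and `is_symmetric` are hypothetical.

```python
def swap_variables(poly, i, j):
    """Return the polynomial obtained by interchanging variables x_{i+1} and x_{j+1}.

    poly maps exponent tuples (a_1, ..., a_N) to nonzero coefficients;
    the indices i and j are 0-based positions in the exponent tuple.
    """
    out = {}
    for exp, coeff in poly.items():
        e = list(exp)
        e[i], e[j] = e[j], e[i]
        out[tuple(e)] = out.get(tuple(e), 0) + coeff
    return out


def is_symmetric(poly, N):
    """Test symmetry by checking only the N - 1 basic transpositions."""
    return all(swap_variables(poly, i, i + 1) == poly for i in range(N - 1))


# x_1^2 + x_2^2 + x_3^2 is symmetric.
power_sum_2 = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}

# x_1^2 x_2 + x_2^2 x_3 + x_3^2 x_1 is invariant under cyclic shifts only.
cyclic = {(2, 1, 0): 1, (0, 2, 1): 1, (1, 0, 2): 1}
```

Here `is_symmetric(power_sum_2, 3)` holds, while `is_symmetric(cyclic, 3)` fails: interchanging $x_1$ and $x_2$ turns $x_1^2x_2$ into $x_1x_2^2$, which does not occur in the second polynomial.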

We now define some particular symmetric polynomials. 


9.14. Definition: Power Sums. For every $k \ge 1$, the kth power-sum polynomial in N 
variables is 

\[ p_k(x_1, x_2, \ldots, x_N) = x_1^k + x_2^k + \cdots + x_N^k. \]

For example, $p_3(x_1,x_2,x_3,x_4,x_5) = x_1^3 + x_2^3 + x_3^3 + x_4^3 + x_5^3$. It is immediate that each 
power-sum polynomial $p_k(x_1,\ldots,x_N)$ is indeed symmetric. 

9.15. Definition: Elementary Symmetric Polynomials. For fixed k with $1 \le k \le N$, 
the kth elementary symmetric polynomial in N variables is 

\[ e_k(x_1, x_2, \ldots, x_N) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \]

We also set $e_0(x_1,\ldots,x_N) = 1$ and $e_k(x_1,\ldots,x_N) = 0$ for all k > N. 

For example, $e_2(x_1,x_2,x_3,x_4) = x_1x_2 + x_1x_3 + x_1x_4 + x_2x_3 + x_2x_4 + x_3x_4$. One readily 
verifies that each $e_k$ really is symmetric. By formula (9.1), we see that $e_k(x_1,\ldots,x_N) = 
s_{(1^k)}(x_1,\ldots,x_N)$, so that elementary symmetric polynomials are special cases of Schur 
polynomials. 


9.16. Definition: Complete Symmetric Polynomials. For fixed $k \ge 1$, the kth 
complete homogeneous symmetric polynomial in N variables is 

\[ h_k(x_1, x_2, \ldots, x_N) = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_k \le N} x_{i_1} x_{i_2} \cdots x_{i_k}. \]

We also set $h_0(x_1,\ldots,x_N) = 1$. 


One may verify that each $h_k$ really is symmetric. We call $h_k$ “complete” because it is the 
sum of all monomials of degree k in the given variables. For example, $h_2(x_1,x_2,x_3) = x_1^2 + 
x_2^2 + x_3^2 + x_1x_2 + x_1x_3 + x_2x_3$. By formula (9.2), we see that $h_k(x_1,\ldots,x_N) = s_{(k)}(x_1,\ldots,x_N)$, 
so that complete symmetric polynomials are also special cases of Schur polynomials. 
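
The index conditions in the definitions of $e_k$ and $h_k$ correspond directly to k-element subsets and k-element multisets of the variable indices. As an illustration only (the function names are hypothetical, and polynomials are again stored as dictionaries from exponent vectors to coefficients), the following Python sketch builds both families with the standard library functions `itertools.combinations` and `itertools.combinations_with_replacement`.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement


def monomial(indices, N):
    """Exponent vector of the product of x_{i+1} over the given 0-based indices."""
    exp = [0] * N
    for i in indices:
        exp[i] += 1
    return tuple(exp)


def elementary(k, N):
    """e_k(x_1..x_N): one monomial per k-element subset of the indices."""
    return Counter(monomial(c, N) for c in combinations(range(N), k))


def complete(k, N):
    """h_k(x_1..x_N): one monomial per k-element multiset of the indices."""
    return Counter(monomial(c, N)
                   for c in combinations_with_replacement(range(N), k))
```

For example, `elementary(2, 4)` has the six terms of $e_2(x_1,x_2,x_3,x_4)$ listed above, `complete(2, 3)` has the six terms of $h_2(x_1,x_2,x_3)$, `elementary(0, N)` is the constant 1, and `elementary(k, N)` is empty (the zero polynomial) when k > N.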

We now show how a partition $\mu$ of length at most N can be used to create a symmetric 
polynomial in N variables. First we need some notation. 


9.17. Definition: Sets of Partitions. Given integers $k \ge 0$ and N > 0, let $\mathrm{Par}_N(k)$ be 
the set of partitions $\mu$ with $|\mu| = k$ and $\ell(\mu) \le N$. (Recall that $\mu_i = 0$ for $\ell(\mu) < i \le N$.) 
Let $\mathrm{Par}_N$ be the set of partitions of length at most N. Let Par be the set of all partitions, 
and let Par(k) be the set of all partitions of k. 


For $\alpha = (\alpha_1, \ldots, \alpha_N) \in \mathbb{Z}_{\ge 0}^N$, let $x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_N^{\alpha_N}$. Also, let $\mathrm{sort}(\alpha) \in \mathrm{Par}_N$ be the 
unique partition obtained by sorting the entries of $\alpha$ into weakly decreasing order. 


9.18. Definition: Monomial Symmetric Polynomials. Given a partition $\mu \in \mathrm{Par}_N$, 
the monomial symmetric polynomial in N variables indexed by $\mu$ is 

\[ m_\mu(x_1, \ldots, x_N) = \sum_{\alpha \in \mathbb{Z}_{\ge 0}^N :\ \mathrm{sort}(\alpha) = \mu} x^\alpha. \]

Informally, $m_\mu(x_1,\ldots,x_N)$ is the sum of all distinct monomials $x_1^{\alpha_1} \cdots x_N^{\alpha_N}$ whose 
exponent vector can be rearranged to give $\mu$. For example, 

\[ m_{(3,3,2)}(x_1,x_2,x_3) = x_1^3x_2^3x_3^2 + x_1^3x_2^2x_3^3 + x_1^2x_2^3x_3^3. \]

Some of our previous examples are instances of monomial symmetric polynomials. Namely, 
we have $p_k(x_1,\ldots,x_N) = m_{(k)}(x_1,\ldots,x_N)$ and $e_k(x_1,\ldots,x_N) = m_{(1^k)}(x_1,\ldots,x_N)$. 
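Let us check that $m_\mu$ really is symmetric. Given $w \in S_N$ with inverse $w'$, we have 

\[ m_\mu(x_{w(1)}, \ldots, x_{w(N)}) = \sum_{\alpha :\ \mathrm{sort}(\alpha) = \mu} x_{w(1)}^{\alpha_1} \cdots x_{w(N)}^{\alpha_N} = \sum_{\alpha :\ \mathrm{sort}(\alpha) = \mu} x_1^{\alpha_{w'(1)}} \cdots x_N^{\alpha_{w'(N)}}. \]

The last step follows by noting that w(j) = i iff j = w'(i), so that $x_{w(j)}^{\alpha_j} = x_i^{\alpha_{w'(i)}}$ holds for 
i = w(j). To continue, introduce a new summation variable $\beta = (\alpha_{w'(1)}, \ldots, \alpha_{w'(N)})$. The 
entries of $\beta$ are obtained by rearranging the entries of $\alpha$, so $\mathrm{sort}(\beta) = \mathrm{sort}(\alpha) = \mu$. As $\alpha$ 
ranges over all vectors in $\mathbb{Z}_{\ge 0}^N$ that sort to $\mu$, so does $\beta$. The calculation therefore continues: 

\[ \sum_{\alpha :\ \mathrm{sort}(\alpha) = \mu} x_1^{\alpha_{w'(1)}} \cdots x_N^{\alpha_{w'(N)}} = \sum_{\beta :\ \mathrm{sort}(\beta) = \mu} x_1^{\beta_1} \cdots x_N^{\beta_N} = m_\mu(x_1, \ldots, x_N). \]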
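
The definition of $m_\mu$ and the reindexing argument above can be illustrated computationally. In this Python sketch (the names `monomial_symmetric` and `permute_variables` are hypothetical, and exponent vectors are stored as tuples), $m_\mu$ is the set of distinct rearrangements of $\mu$ padded with zeroes to length N, and permuting the variables by w merely permutes this set.

```python
from itertools import permutations


def monomial_symmetric(mu, N):
    """m_mu(x_1..x_N) as {exponent vector: 1}, one key per distinct rearrangement."""
    if len(mu) > N:
        raise ValueError("the partition must have length at most N")
    padded = tuple(mu) + (0,) * (N - len(mu))
    return {alpha: 1 for alpha in set(permutations(padded))}


def permute_variables(poly, w):
    """Replace x_j by x_{w(j)} for each j, where w is a 0-based permutation tuple.

    The exponent of x_j in each monomial moves to position w[j].
    """
    out = {}
    for alpha, coeff in poly.items():
        beta = [0] * len(alpha)
        for j, a in enumerate(alpha):
            beta[w[j]] = a
        out[tuple(beta)] = out.get(tuple(beta), 0) + coeff
    return out
```

For example, `monomial_symmetric((3, 3, 2), 3)` has exactly the three terms displayed after Definition 9.18, and applying `permute_variables` with any permutation of (0, 1, 2) returns the same dictionary.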


9.19. Definition: The Space $\Lambda_N$. Let $\Lambda_N$ be the set of all symmetric polynomials in 
$\mathbb{R}[x_1, \ldots, x_N]$. 


If two polynomials f and g are symmetric, so are f + g, −f, and fg. For example, given 
$f, g \in \Lambda_N$ and $w \in S_N$, 

\[ (fg)(x_{w(1)}, \ldots, x_{w(N)}) = f(x_{w(1)}, \ldots, x_{w(N)})\, g(x_{w(1)}, \ldots, x_{w(N)}) = f(x_1, \ldots, x_N)\, g(x_1, \ldots, x_N) = (fg)(x_1, \ldots, x_N). \]

Also, any constant polynomial c is certainly symmetric, and hence any scalar multiple cf of 
a symmetric polynomial f is symmetric. These comments imply that $\Lambda_N$ is a subring and 
vector subspace of $\mathbb{R}[x_1, \ldots, x_N]$, so $\Lambda_N$ is a subalgebra of the real algebra of polynomials 
in N variables (see the Appendix for the definitions of these terms). 

We have just seen that $\Lambda_N$ is closed under products. So, we can multiply together 
polynomials of the form $e_k$, $h_k$, or $p_k$ to obtain even more examples of symmetric polynomials. 
This leads to the following definition. 


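9.20. Definition: The Symmetric Polynomials $e_\alpha$, $h_\alpha$, and $p_\alpha$. Let $\alpha = (\alpha_1,\ldots,\alpha_s)$ be any sequence of positive integers. Define

\begin{align*}
e_\alpha(x_1,\ldots,x_N) &= \prod_{i=1}^{s} e_{\alpha_i}(x_1,\ldots,x_N); \\
h_\alpha(x_1,\ldots,x_N) &= \prod_{i=1}^{s} h_{\alpha_i}(x_1,\ldots,x_N); \\
p_\alpha(x_1,\ldots,x_N) &= \prod_{i=1}^{s} p_{\alpha_i}(x_1,\ldots,x_N).
\end{align*}

We call $e_\alpha$ the elementary symmetric polynomial indexed by $\alpha$; $h_\alpha$ the complete homogeneous symmetric polynomial indexed by $\alpha$; and $p_\alpha$ the power-sum symmetric polynomial indexed by $\alpha$ (in $N$ variables).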


These definitions are most frequently used when $\alpha$ is a partition. Suppose the sequence $\alpha$ can be sorted to give the partition $\mu$. Then $e_\alpha = e_\mu$, $h_\alpha = h_\mu$, and $p_\alpha = p_\mu$, because multiplication of polynomials is commutative. More generally, if $\alpha$ and $\beta$ are rearrangements of each other, then $e_\alpha = e_\beta$, $h_\alpha = h_\beta$, and $p_\alpha = p_\beta$.
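
As a small computational illustration of Definition 9.20, the sketch below stores a polynomial as a dictionary from exponent vectors to coefficients. It assumes the standard definitions of $e_k$, $h_k$, and $p_k$ (sums over $k$-element subsets, $k$-element multisets, and $k$th powers of the variables, respectively); the function names `e`, `h`, `p`, `multiply`, and `indexed` are hypothetical helpers introduced here.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

def e(k, n):
    """Elementary symmetric polynomial e_k in n variables."""
    poly = Counter()
    for subset in combinations(range(n), k):
        poly[tuple(1 if i in subset else 0 for i in range(n))] += 1
    return poly

def h(k, n):
    """Complete homogeneous symmetric polynomial h_k in n variables."""
    poly = Counter()
    for multiset in combinations_with_replacement(range(n), k):
        exps = [0] * n
        for i in multiset:
            exps[i] += 1
        poly[tuple(exps)] += 1
    return poly

def p(k, n):
    """Power-sum symmetric polynomial p_k in n variables."""
    poly = Counter()
    for i in range(n):
        poly[tuple(k if j == i else 0 for j in range(n))] += 1
    return poly

def multiply(f, g):
    """Product of two polynomials stored as exponent-vector dictionaries."""
    prod = Counter()
    for a, ca in f.items():
        for b, cb in g.items():
            prod[tuple(x + y for x, y in zip(a, b))] += ca * cb
    return prod

def indexed(base, alpha, n):
    """The product base(alpha_1) * base(alpha_2) * ... in n variables."""
    result = Counter({(0,) * n: 1})  # the constant polynomial 1
    for part in alpha:
        result = multiply(result, base(part, n))
    return result

# Rearranging the index sequence does not change the product.
print(indexed(e, (2, 1), 3) == indexed(e, (1, 2), 3))
print(dict(indexed(p, (2, 1), 2)))
```

The last line expands $p_{(2,1)}(x_1,x_2) = (x_1^2+x_2^2)(x_1+x_2)$ into its four monomials.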


9.21. Remark. The power-sum polynomials $p_k$ have already appeared in our discussion of Pólya's Formula (§7.16), where they were used to count weighted colorings with symmetries taken into account.




9.4 Vector Spaces of Symmetric Polynomials 


When studying symmetric polynomials, it is often helpful to focus attention on those polynomials that are homogeneous of a given degree. We recall that the degree of an individual monomial $x^\alpha = x_1^{\alpha_1}\cdots x_N^{\alpha_N}$ is $\alpha_1+\cdots+\alpha_N$. A polynomial $p$ is called homogeneous of degree $k$ iff every monomial $x^\alpha$ appearing in $p$ with nonzero coefficient has degree $k$. In particular, the zero polynomial is homogeneous of every degree.


9.22. Definition: The Space $\Lambda_N^k$. For all $k\ge 0$ and $N>0$, let $\Lambda_N^k$ be the set of symmetric polynomials $p\in\Lambda_N$ such that $p$ is homogeneous of degree $k$.


It can be checked that for all $f,g\in\Lambda_N^k$ and $c\in\mathbb{R}$, $f+g\in\Lambda_N^k$ and $cf\in\Lambda_N^k$. This means that $\Lambda_N^k$ is a subspace of $\Lambda_N$. Furthermore, each symmetric polynomial can be written uniquely as a finite sum of its nonzero homogeneous components; this means that the vector space $\Lambda_N$ is the direct sum of these subspaces: $\Lambda_N = \bigoplus_{k\ge 0}\Lambda_N^k$. Finally, for all $p\in\Lambda_N^k$ and $q\in\Lambda_N^m$, we have $pq\in\Lambda_N^{k+m}$, which means that this direct sum decomposition turns $\Lambda_N$ into a graded algebra.

The vector space $\Lambda_N$ is infinite-dimensional, but each homogeneous subspace $\Lambda_N^k$ is finite-dimensional. A recurring theme in the theory of symmetric polynomials is the problem of finding different bases of the vector space $\Lambda_N^k$ and understanding the relations between these bases. We begin in this section by considering the most straightforward basis for this vector space, which consists of certain monomial symmetric polynomials.


9.23. Theorem: Monomial Basis of $\Lambda_N^k$. For every $k\ge 0$ and $N>0$,
\[ \{m_\mu(x_1,\ldots,x_N) : \mu\in\mathrm{Par}_N(k)\} \]
is a basis for the vector space $\Lambda_N^k$.


Proof. For $\mu\in\mathrm{Par}_N(k)$, recall that $m_\mu$ is the sum of all distinct monomials $x^\alpha$ such that $\alpha$ can be rearranged to give $\mu$. Each of these monomials has degree $|\mu| = k$, so that each $m_\mu$ in the given set is homogeneous of degree $k$ and thus belongs to $\Lambda_N^k$. Next, let us prove the linear independence of the given monomial symmetric polynomials. Suppose some linear combination of these polynomials is the zero polynomial, say

\[ \sum_{\mu\in\mathrm{Par}_N(k)} c_\mu m_\mu(x_1,\ldots,x_N) = 0 \quad\text{where } c_\mu\in\mathbb{R}. \tag{9.3} \]

Consider a fixed $\nu\in\mathrm{Par}_N(k)$. Given any partition $\mu\neq\nu$, we cannot rearrange the parts of $\nu$ to obtain $\mu$. It follows that $m_\nu$ is the only monomial symmetric polynomial in the sum in which $x^\nu$ appears with nonzero coefficient. The coefficient of $x^\nu$ in $m_\nu$ is 1. Extracting the coefficient of $x^\nu$ on both sides of (9.3) therefore gives $c_\nu\cdot 1 = 0$. Since $\nu$ was arbitrary, every coefficient $c_\nu$ is zero, completing the proof of linear independence.

Next, let us prove that the given monomial symmetric polynomials span $\Lambda_N^k$. Let $f\in\Lambda_N^k$ be any homogeneous symmetric polynomial of degree $k$. For each $\mu\in\mathrm{Par}_N(k)$, define $d_\mu\in\mathbb{R}$ to be the coefficient of $x^\mu$ in $f$. We claim that

\[ \sum_{\mu\in\mathrm{Par}_N(k)} d_\mu m_\mu(x_1,\ldots,x_N) = f(x_1,\ldots,x_N). \tag{9.4} \]

Since both sides are homogeneous of degree $k$, it suffices to check that the coefficients of $x^\alpha$ on both sides of (9.4) are equal, for all $\alpha\in\mathbb{Z}_{\ge 0}^N$ with $\alpha_1+\cdots+\alpha_N = k$. Fix such an $\alpha$, and note that there is a unique partition $\nu\in\mathrm{Par}_N(k)$ such that $\operatorname{sort}(\alpha) = \nu$. As in the previous paragraph, the coefficient of $x^\alpha$ is 1 in $m_\nu$ and is 0 in $m_\mu$ for all $\mu\neq\nu$. Thus, the coefficient of $x^\alpha$ in $\sum_\mu d_\mu m_\mu$ is $d_\nu$. On the other hand, since $f$ is symmetric, the coefficient of $x^\alpha$ in $f$ must be the same as the coefficient of $x^\nu$ in $f$, since some permutation of the variables changes $x^\alpha$ into $x^\nu$. Since the coefficient of $x^\nu$ in $f$ is $d_\nu$ by definition, we are done. $\square$


9.24. Remark. In the previous theorem, we need to restrict attention to those partitions $\mu$ of length at most $N$, since $m_\mu(x_1,\ldots,x_N)$ is not defined when $\ell(\mu) > N$. On the other hand, if the number of variables $N$ is at least $k$, then $\mathrm{Par}_N(k) = \mathrm{Par}(k)$ since the length of any partition of $k$ is at most $k$. Therefore, when $N\ge k$, $\Lambda_N^k$ has basis $\{m_\mu(x_1,\ldots,x_N) : \mu\in\mathrm{Par}(k)\}$, and the dimension of $\Lambda_N^k$ is $p(k)$, the number of integer partitions of $k$.
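
By Theorem 9.23, the dimension of $\Lambda_N^k$ is the number of partitions of $k$ with at most $N$ parts. The following Python sketch enumerates these partitions directly; the names `partitions` and `dimension` are hypothetical helpers chosen for this illustration.

```python
def partitions(k, max_parts=None, largest=None):
    """Yield the partitions of k with at most max_parts parts, each part <= largest."""
    if max_parts is None:
        max_parts = k
    if largest is None:
        largest = k
    if k == 0:
        yield ()          # the empty partition
        return
    if max_parts == 0:
        return            # no parts left to use, but k > 0 remains
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, max_parts - 1, first):
            yield (first,) + rest

def dimension(n_vars, k):
    """Dimension of the space of degree-k symmetric polynomials in n_vars variables."""
    return sum(1 for _ in partitions(k, max_parts=n_vars))

print(dimension(2, 3))   # only (3) and (2,1) have at most two parts
print(dimension(6, 6))   # N >= k, so this is p(6)
```

Once $N\ge k$, increasing $N$ further does not change the count, in agreement with Remark 9.24.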


9.5 Symmetry of Schur Polynomials 


Recall the formula for Schur polynomials from Definition 9.8:

\[ s_\mu(x_1,\ldots,x_N) = \sum_{T\in\mathrm{SSYT}_N(\mu)} x^{c(T)}. \]


This section gives a bijective proof that all Schur polynomials are symmetric. First, we give 
names to the coefficients in these polynomials. 


9.25. Definition: Kostka Numbers. For each partition $\mu\in\mathrm{Par}_N$ and each $\alpha\in\mathbb{Z}_{\ge 0}^N$, define the Kostka number $K_{\mu,\alpha}$ to be the coefficient of $x^\alpha$ in $s_\mu(x_1,\ldots,x_N)$. So, $K_{\mu,\alpha}$ is the number of semistandard tableaux of shape $\mu$ and content $\alpha$.


9.26. Example. The calculations in Example 9.11 show that

\[ K_{(4,3),(2,2,2,1)} = K_{(4,3),(1,2,2,2)} = K_{(4,3),(2,1,2,2)} = 4. \]

Similarly, we see from Example 9.4 that $K_{(3,2),(1,1,1,1,1)} = 5$.
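
Kostka numbers for small shapes can be checked by direct enumeration. The following Python sketch counts semistandard tableaux of a given shape and content by backtracking over the cells in reading order; the function name `kostka` is a hypothetical helper, and the method is a brute-force check rather than an efficient algorithm.

```python
def kostka(shape, content):
    """Count semistandard tableaux of the given shape and content by backtracking."""
    if sum(shape) != sum(content):
        return 0
    # Cells in reading order: row by row, left to right.
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    remaining = list(content)   # remaining[v-1] = copies of v still to be placed
    filling = {}

    def fill(pos):
        if pos == len(cells):
            return 1
        r, c = cells[pos]
        low = 1
        if c > 0:
            low = max(low, filling[(r, c - 1)])       # rows weakly increase
        if r > 0:
            low = max(low, filling[(r - 1, c)] + 1)   # columns strictly increase
        total = 0
        for v in range(low, len(remaining) + 1):
            if remaining[v - 1] > 0:
                remaining[v - 1] -= 1
                filling[(r, c)] = v
                total += fill(pos + 1)
                remaining[v - 1] += 1
        return total

    return fill(0)

print(kostka((4, 3), (2, 2, 2, 1)))
print(kostka((3, 2), (1, 1, 1, 1, 1)))
```

The two printed values correspond to the Kostka numbers quoted in Example 9.26.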
The following result is the key to proving the symmetry of Schur polynomials. 


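9.27. Theorem: Symmetry of Kostka Numbers. For all partitions $\mu\in\mathrm{Par}_N$ and all $\alpha,\beta\in\mathbb{Z}_{\ge 0}^N$ such that $\operatorname{sort}(\alpha) = \operatorname{sort}(\beta)$, we have $K_{\mu,\alpha} = K_{\mu,\beta}$.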


Proof. Fix $\mu,\alpha,\beta$ as in the theorem statement. Since $\operatorname{sort}(\alpha) = \operatorname{sort}(\beta)$, we can pass from $\alpha$ to $\beta$ by an appropriate permutation of the entries of $\alpha$. This permutation can be achieved in finitely many steps by repeatedly interchanging two consecutive entries of $\alpha$ (compare to Theorem 7.29). By induction, it is enough to prove the result when $\beta$ is obtained from $\alpha$ by switching $\alpha_i$ and $\alpha_{i+1}$ for some $i$.

Let $Y$ be the set of all tableaux $T\in\mathrm{SSYT}_N(\mu)$ such that $c(T) = \alpha$, and let $Z$ be the set of all tableaux $T\in\mathrm{SSYT}_N(\mu)$ such that $c(T) = \beta$. Since $|Y| = K_{\mu,\alpha}$ and $|Z| = K_{\mu,\beta}$, it suffices to define a bijection $f_i : Y\to Z$. An input to the map $f_i$ is a semistandard tableau $T$ of shape $\mu$ and content $\alpha$. The output $f_i(T)$ must be a new semistandard tableau of shape $\mu$ in which the number of $i$'s and $(i+1)$'s are switched, while the number of $k$'s (for all $k\neq i,i+1$) is unchanged. We illustrate the action of $f_3$ on the following tableau:

Observe that certain occurrences of 3 are matched with an occurrence of 4 in the cell directly below. Let us underline the 3's and 4's that are not part of these matched pairs:

[Figure: the example tableau with its unmatched 3's and 4's underlined.]


Notice that each row of the tableau contains a (possibly empty) run of consecutive cells consisting of underlined 3's and 4's. The entries directly above these cells (when they exist) are less than 3, while any entries directly below are greater than 4. So we are free to change the frequencies of 3's and 4's within each run without altering the fact that the filling is a semistandard tableau. If the run of underlined entries in a given row consists of $j$ threes followed by $k$ fours (where $j,k\ge 0$), we change this to a run consisting of $k$ threes followed by $j$ fours. Doing this in every row switches the overall frequency of 3's and 4's in the entire filling. Note in particular that the matched pairs are not altered, and these pairs contribute equally to the frequency counts for 3 and 4. Our example tableau is mapped by $f_3$ to the following tableau:

Applying the same run-modification process to this new tableau restores the original tableau; this means that $f_3$ is a bijection. As another example of the action of $f_3$, we have:

[Figure: a second tableau and its image under $f_3$.]


The definition of $f_i$ for general $i$ is analogous. We locate and ignore matched pairs consisting of an $i$ directly atop an $i+1$, then underline the remaining $i$'s and $(i+1)$'s, then switch the relative frequencies of the underlined $i$'s and $(i+1)$'s in each row. This action maintains the property of being a tableau, switches the overall frequency of $i$'s and $(i+1)$'s, and preserves the frequency of all other letters. Applying the algorithm twice restores the original tableau, so we have found the required bijection. $\square$
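
The map $f_i$ from the preceding proof can be sketched in code as follows. This is an illustrative Python version, assuming a tableau is stored as a list of rows (each a list of integers); the function name `swap_frequencies` is hypothetical.

```python
def swap_frequencies(tableau, i):
    """Apply the map f_i from the proof to a semistandard tableau (list of rows)."""
    result = [list(row) for row in tableau]
    for r, row in enumerate(tableau):
        free = []  # columns of unmatched i's and (i+1)'s in this row
        for c, v in enumerate(row):
            has_below = r + 1 < len(tableau) and c < len(tableau[r + 1])
            below = tableau[r + 1][c] if has_below else None
            above = tableau[r - 1][c] if r > 0 else None
            if v == i and below != i + 1:
                free.append(c)        # an i with no i+1 directly below
            elif v == i + 1 and above != i:
                free.append(c)        # an i+1 with no i directly above
        num_i = sum(1 for c in free if row[c] == i)
        num_next = len(free) - num_i
        # The free cells form one run of num_i copies of i followed by
        # num_next copies of i+1; replace it by num_next i's, then num_i (i+1)'s.
        for offset, c in enumerate(free):
            result[r][c] = i if offset < num_next else i + 1
    return result

T = [[1, 1, 2, 2], [3, 3, 4]]          # content (2, 2, 2, 1)
U = swap_frequencies(T, 3)             # content (2, 2, 1, 2)
V = swap_frequencies(U, 2)             # content (2, 1, 2, 2)
print(U, V)
print(swap_frequencies(U, 3) == T)     # applying f_3 twice restores T
```

Matched pairs are detected from the original tableau, not from the partially updated copy, so each row is processed independently, as in the proof.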


9.28. Example. Let us trace through the preceding proof to construct bijections between the sets of tableaux in Example 9.11. We must chain together appropriate maps $f_i$, where the values of $i$ are chosen to rearrange the starting content vector $\alpha$ into the target content vector $\beta$. For example, we can go from content $(2,2,2,1)$ to content $(2,1,2,2)$ by applying $f_3$ and then $f_2$. The first tableau of content $(2,2,2,1)$ displayed in Example 9.11 maps to a tableau of content $(2,1,2,2)$ via these steps:

    1 1 2 2   --f_3-->   1 1 2 2   --f_2-->   1 1 3 3
    3 3 4                3 4 4                2 4 4

If we continue by applying the map $f_1$, we reach a tableau with content $(1,2,2,2)$:

    1 1 3 3   --f_1-->   1 2 3 3
    2 4 4                2 4 4

The inverse bijection is computed by applying the maps in the reverse order. For example, the first tableau of content $(1,2,2,2)$ in Example 9.11 is mapped to a tableau of content $(2,2,2,1)$ via the following steps.

[Figure: a chain of tableaux obtained by applying $f_1$, then $f_2$, then $f_3$.]

We can now deduce the symmetry of Schur polynomials. In fact, we can even expand these polynomials as specific linear combinations of monomial symmetric polynomials using the Kostka numbers.


9.29. Theorem: Monomial Expansion of Schur Polynomials. For all partitions $\mu\in\mathrm{Par}_N(k)$ and all $N\ge 1$,

\[ s_\mu(x_1,\ldots,x_N) = \sum_{\rho\in\mathrm{Par}_N(k)} K_{\mu,\rho}\, m_\rho(x_1,\ldots,x_N). \tag{9.5} \]

So, $s_\mu(x_1,\ldots,x_N)$ is a symmetric polynomial belonging to the space $\Lambda_N^k$.


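Proof. We calculate as follows:

\begin{align*}
s_\mu(x_1,\ldots,x_N) &= \sum_{\alpha\in\mathbb{Z}_{\ge 0}^N} K_{\mu,\alpha}x^\alpha
 = \sum_{\rho\in\mathrm{Par}_N(k)}\ \sum_{\substack{\alpha\in\mathbb{Z}_{\ge 0}^N:\\ \operatorname{sort}(\alpha)=\rho}} K_{\mu,\alpha}x^\alpha \\
&= \sum_{\rho\in\mathrm{Par}_N(k)}\ \sum_{\substack{\alpha\in\mathbb{Z}_{\ge 0}^N:\\ \operatorname{sort}(\alpha)=\rho}} K_{\mu,\rho}x^\alpha
 = \sum_{\rho\in\mathrm{Par}_N(k)} K_{\mu,\rho} \sum_{\substack{\alpha\in\mathbb{Z}_{\ge 0}^N:\\ \operatorname{sort}(\alpha)=\rho}} x^\alpha \\
&= \sum_{\rho\in\mathrm{Par}_N(k)} K_{\mu,\rho}\, m_\rho(x_1,\ldots,x_N).
\end{align*}

The first equality follows from the definition of Kostka numbers. In the second equality, we reorganize the sum by grouping together those $\alpha\in\mathbb{Z}_{\ge 0}^N$ with $\operatorname{sort}(\alpha) = \rho$. We only need to sum over partitions $\rho\in\mathrm{Par}_N(k)$, since $x^{c(T)}$ is a monomial of degree $k$ for every tableau $T$ of shape $\mu$. The third equality follows from Theorem 9.27. The fourth equality uses the fact that $K_{\mu,\rho}$ does not depend on the inner summation index $\alpha$. The final equality follows by definition of $m_\rho$. Since $s_\mu$ is a linear combination of basis elements of the subspace $\Lambda_N^k$, we see that $s_\mu$ belongs to this subspace. In particular, $s_\mu$ is symmetric. $\square$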




9.6 Orderings on Partitions 


We intend to use Theorem 9.29 to find bases for the vector spaces $\Lambda_N^k$ consisting of certain Schur polynomials. For this purpose, we must first introduce some order relations on partitions.


9.30. Definition: Lexicographic Ordering of Partitions. Given $\mu,\nu\in\mathrm{Par}(k)$, we say that $\mu$ is lexicographically smaller than $\nu$, written $\mu\le_{\mathrm{lex}}\nu$, iff either $\mu = \nu$ or the first nonzero entry in the sequence $\nu-\mu$ is positive.


In other words, $\mu\le_{\mathrm{lex}}\nu$ iff $\mu = \nu$ or there exists $j$ such that $\mu_1 = \nu_1$, $\mu_2 = \nu_2$, ..., $\mu_{j-1} = \nu_{j-1}$, and $\mu_j < \nu_j$. This definition uses the convention that $\mu_i = 0$ for $i > \ell(\mu)$ and $\nu_i = 0$ for $i > \ell(\nu)$. It is routine to check that $\le_{\mathrm{lex}}$ is a total ordering of the set $\mathrm{Par}(k)$ for each $k\ge 0$.

9.31. Example. Here is a list of all integer partitions of 6, written in lexicographic order from smallest to largest:

\begin{align*}
&(1,1,1,1,1,1) \le_{\mathrm{lex}} (2,1,1,1,1) \le_{\mathrm{lex}} (2,2,1,1) \le_{\mathrm{lex}} (2,2,2) \le_{\mathrm{lex}} (3,1,1,1) \\
&\le_{\mathrm{lex}} (3,2,1) \le_{\mathrm{lex}} (3,3) \le_{\mathrm{lex}} (4,1,1) \le_{\mathrm{lex}} (4,2) \le_{\mathrm{lex}} (5,1) \le_{\mathrm{lex}} (6).
\end{align*}

In more detail, $(3,1,1,1)\le_{\mathrm{lex}}(3,2,1)$ since
\[ (3,2,1,0,0,\ldots) - (3,1,1,1,0,\ldots) = (0,1,0,-1,0,\ldots) \]
and the first nonzero entry in this sequence is positive.
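
The lexicographic list in Example 9.31 can be reproduced mechanically. In the Python sketch below, partitions are tuples, `lex_leq` implements Definition 9.30 with zero padding, and `partitions` is a hypothetical helper that generates $\mathrm{Par}(k)$. For partitions of the same integer $k$, Python's built-in tuple comparison agrees with $\le_{\mathrm{lex}}$, because no partition of $k$ is a proper prefix of another partition of $k$.

```python
from itertools import zip_longest

def partitions(k, largest=None):
    """Yield all partitions of k as weakly decreasing tuples."""
    largest = k if largest is None else largest
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def lex_leq(mu, nu):
    """True iff mu <=_lex nu, padding the shorter partition with zeros."""
    for a, b in zip_longest(mu, nu, fillvalue=0):
        if a != b:
            return a < b   # sign of the first nonzero entry of nu - mu
    return True            # mu == nu

ordered = sorted(partitions(6))    # tuple order matches <=_lex here
for mu in ordered:
    print(mu)
```

Sorting with the built-in order is a convenience; `lex_leq` is the direct transcription of the definition and can be used to confirm that consecutive entries of `ordered` are in increasing order.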


In later sections, we often need matrices and vectors whose rows and columns are indexed by integer partitions. Unless otherwise specified, we always use the lexicographic ordering of partitions to determine which partition labels each row and column of the matrix. For instance, a matrix $A = [c_{\mu,\nu} : \mu,\nu\in\mathrm{Par}(3)]$ is displayed as follows:

\[ A = \begin{bmatrix}
c_{(1,1,1),(1,1,1)} & c_{(1,1,1),(2,1)} & c_{(1,1,1),(3)} \\
c_{(2,1),(1,1,1)} & c_{(2,1),(2,1)} & c_{(2,1),(3)} \\
c_{(3),(1,1,1)} & c_{(3),(2,1)} & c_{(3),(3)}
\end{bmatrix}. \]


We now define a partial ordering on partitions that occurs frequently in the theory of 
symmetric polynomials. 


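9.32. Definition: Dominance Ordering on Partitions. Given partitions $\mu,\nu\in\mathrm{Par}(k)$, we say that $\mu$ is dominated by $\nu$, written $\mu\trianglelefteq\nu$, iff
\[ \mu_1+\mu_2+\cdots+\mu_i \le \nu_1+\nu_2+\cdots+\nu_i \quad\text{for all } i\ge 1. \]
Note that $\mu\not\trianglelefteq\nu$ iff there exists $i\ge 1$ with $\mu_1+\cdots+\mu_i > \nu_1+\cdots+\nu_i$.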


9.33. Example. We have $(2,2,1,1)\trianglelefteq(4,2)$ since $2\le 4$, $2+2\le 4+2$, $2+2+1\le 4+2+0$, and $2+2+1+1\le 4+2+0+0$. On the other hand, $(3,1,1,1)\not\trianglelefteq(2,2,2)$ since $3 > 2$, and $(2,2,2)\not\trianglelefteq(3,1,1,1)$ since $2+2+2 > 3+1+1$. This example shows that not every pair of partitions is comparable under the dominance relation.
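
The partial-sum comparison in Definition 9.32 translates directly into code. The following Python sketch (with the hypothetical function name `dominated_by`) pads the shorter partition with zeros and compares running totals.

```python
from itertools import zip_longest

def dominated_by(mu, nu):
    """True iff every partial sum of mu is at most the matching partial sum of nu."""
    sum_mu = sum_nu = 0
    for a, b in zip_longest(mu, nu, fillvalue=0):
        sum_mu += a
        sum_nu += b
        if sum_mu > sum_nu:
            return False
    return True

print(dominated_by((2, 2, 1, 1), (4, 2)))
print(dominated_by((3, 1, 1, 1), (2, 2, 2)), dominated_by((2, 2, 2), (3, 1, 1, 1)))
```

The second line reports the incomparable pair from Example 9.33: neither partition is dominated by the other.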


9.34. Theorem: Dominance Partial Order. For all $k\ge 0$, the dominance relation is a partial ordering on $\mathrm{Par}(k)$.


Proof. We show that $\trianglelefteq$ is reflexive, antisymmetric, and transitive on $\mathrm{Par}(k)$.

Reflexivity: Given $\mu\in\mathrm{Par}(k)$, we have $\mu_1+\cdots+\mu_i \le \mu_1+\cdots+\mu_i$ for all $i\ge 1$. So $\mu\trianglelefteq\mu$.

Antisymmetry: Suppose $\mu,\nu\in\mathrm{Par}(k)$, $\mu\trianglelefteq\nu$, and $\nu\trianglelefteq\mu$. We know $\mu_1+\cdots+\mu_i \le \nu_1+\cdots+\nu_i$ and also $\nu_1+\cdots+\nu_i \le \mu_1+\cdots+\mu_i$ for all $i$, hence $\mu_1+\cdots+\mu_i = \nu_1+\cdots+\nu_i$ for all $i\ge 1$. In particular, taking $i = 1$ gives $\mu_1 = \nu_1$. For each $i > 1$, subtracting the $(i-1)$th equation from the $i$th equation shows that $\mu_i = \nu_i$. So $\mu = \nu$.

Transitivity: Fix $\mu,\nu,\rho\in\mathrm{Par}(k)$, and assume $\mu\trianglelefteq\nu$ and $\nu\trianglelefteq\rho$; we must prove $\mu\trianglelefteq\rho$. We know $\mu_1+\cdots+\mu_i \le \nu_1+\cdots+\nu_i$ for all $i$, and also $\nu_1+\cdots+\nu_i \le \rho_1+\cdots+\rho_i$ for all $i$. Combining these inequalities yields $\mu_1+\cdots+\mu_i \le \rho_1+\cdots+\rho_i$ for all $i$, so $\mu\trianglelefteq\rho$. $\square$


One may check that $\trianglelefteq$ is a total ordering of $\mathrm{Par}(k)$ iff $k\le 5$.


9.35. Theorem: Lexicographic and Dominance Ordering. For all $\mu,\nu\in\mathrm{Par}(k)$, if $\mu\trianglelefteq\nu$ then $\mu\le_{\mathrm{lex}}\nu$.


Proof. Fix $\mu,\nu\in\mathrm{Par}(k)$ such that $\mu\not\le_{\mathrm{lex}}\nu$; we will prove that $\mu\not\trianglelefteq\nu$. By definition of the lexicographic order, there must exist an index $j\ge 1$ such that $\mu_i = \nu_i$ for all $i < j$, but $\mu_j > \nu_j$. Adding these relations together, we see that $\mu_1+\cdots+\mu_j > \nu_1+\cdots+\nu_j$, and so $\mu\not\trianglelefteq\nu$. $\square$
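
Both the remark that dominance is a total ordering exactly when $k\le 5$ and Theorem 9.35 can be spot-checked by exhaustive search over small $k$. The Python sketch below is such a check, not a proof; all function names are hypothetical helpers introduced for the illustration.

```python
from itertools import zip_longest

def partitions(k, largest=None):
    """Yield all partitions of k as weakly decreasing tuples."""
    largest = k if largest is None else largest
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def dominated_by(mu, nu):
    """True iff every partial sum of mu is at most the matching partial sum of nu."""
    sum_mu = sum_nu = 0
    for a, b in zip_longest(mu, nu, fillvalue=0):
        sum_mu, sum_nu = sum_mu + a, sum_nu + b
        if sum_mu > sum_nu:
            return False
    return True

def lex_leq(mu, nu):
    """True iff mu <=_lex nu, padding the shorter partition with zeros."""
    for a, b in zip_longest(mu, nu, fillvalue=0):
        if a != b:
            return a < b
    return True

def is_total_order(k):
    """Is every pair of partitions of k comparable under dominance?"""
    parts = list(partitions(k))
    return all(dominated_by(a, b) or dominated_by(b, a) for a in parts for b in parts)

def dominance_implies_lex(k):
    """Does mu dominated by nu force mu <=_lex nu for all partitions of k?"""
    parts = list(partitions(k))
    return all(lex_leq(a, b) for a in parts for b in parts if dominated_by(a, b))

print([k for k in range(9) if is_total_order(k)])
print(all(dominance_implies_lex(k) for k in range(9)))
```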
The next definition and theorem allow us to visualize the dominance relation in terms of partition diagrams.


9.36. Definition: Raising Operation. Let $\mu$ and $\nu$ be two partitions of $k$. We say that $\nu$ is related to $\mu$ by a raising operation, denoted $\mu R\nu$, iff there exist $i < j$ such that $\nu_i = \mu_i+1$, $\nu_j = \mu_j-1$, and $\nu_s = \mu_s$ for all $s\neq i,j$.


Intuitively, $\mu R\nu$ means that we can go from the diagram for $\mu$ to the diagram for $\nu$ by taking the last square from some row of $\mathrm{dg}(\mu)$ and moving it to the end of a higher row.


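9.37. Example. The following pictures illustrate a sequence of raising operations.

[Figure: a sequence of partition diagrams, each obtained from the previous one by a raising operation, beginning with $(4,3,3,1,1)$ and ending with $(5,4,2,1)$.]

Observe that $(4,3,3,1,1)\trianglelefteq(5,4,2,1)$, so that the last partition in the sequence dominates the first one. The next theorem shows that this always happens.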


9.38. Theorem: Dominance and Raising Operations. For all $\mu,\nu\in\mathrm{Par}(k)$, $\mu\trianglelefteq\nu$ iff there exist $m\ge 0$ and partitions $\mu^0,\ldots,\mu^m$ such that $\mu = \mu^0$, $\mu^{i-1}R\mu^i$ for $1\le i\le m$, and $\mu^m = \nu$.

Proof. We first show that $\mu R\nu$ implies $\mu\trianglelefteq\nu$. Suppose $\nu = (\mu_1,\ldots,\mu_i+1,\ldots,\mu_j-1,\ldots)$ as in the definition of raising operations. Let us check that $\nu_1+\cdots+\nu_k \ge \mu_1+\cdots+\mu_k$ holds for all $k\ge 1$. This is true for $k < i$, since equality holds for these values of $k$. If $k = i$, then $\nu_1+\cdots+\nu_k = \mu_1+\cdots+\mu_{i-1}+(\mu_i+1) > \mu_1+\cdots+\mu_k$. Similarly, for all $k$ with $i < k < j$, we have $\nu_1+\cdots+\nu_k = \mu_1+\cdots+\mu_k+1 > \mu_1+\cdots+\mu_k$. Finally, for all $k\ge j$, we have $\nu_1+\cdots+\nu_k = \mu_1+\cdots+\mu_k$, since the $+1$ and $-1$ adjustments to parts $i$ and $j$ cancel out.

Next, suppose $\mu$ and $\nu$ are linked by a chain of raising operations as in the theorem statement. By the previous paragraph, we know $\mu = \mu^0$, $\mu^{i-1}\trianglelefteq\mu^i$ for $1\le i\le m$, and $\mu^m = \nu$. Since $\trianglelefteq$ is transitive, we conclude that $\mu\trianglelefteq\nu$, as needed.

Conversely, suppose that $\mu\trianglelefteq\nu$. Consider the vector $(d_1,d_2,\ldots)$ such that $d_s = (\nu_1+\cdots+\nu_s) - (\mu_1+\cdots+\mu_s)$. Since $\mu\trianglelefteq\nu$, we have $d_s\ge 0$ for all $s$. Also, $d_s = 0$ for all large enough $s$, since $\mu$ and $\nu$ are both partitions of $k$. We use strong induction on $n = \sum_s d_s$ to show that we can go from $\mu$ to $\nu$ by a sequence of raising operations. If $n = 0$, then $\mu = \nu$, and we can take $m = 0$ and $\mu^0 = \mu = \nu$. Otherwise, let $i$ be the least index such that $d_i > 0$, and let $j$ be the least index after $i$ such that $d_j = 0$. The choice of $i$ shows that $\mu_s = \nu_s$ for all $s < i$, but $\mu_i < \nu_i$. If $i > 1$, the inequality $\mu_i < \nu_i \le \nu_{i-1} = \mu_{i-1}$ shows that it is possible to add one box to the end of row $i$ in $\mathrm{dg}(\mu)$ and still get a partition diagram. If $i = 1$, the addition of this box certainly gives a partition diagram. On the other hand, the relations $d_{j-1} > 0$, $d_j = 0$ mean that $\mu_1+\cdots+\mu_{j-1} < \nu_1+\cdots+\nu_{j-1}$ but $\mu_1+\cdots+\mu_j = \nu_1+\cdots+\nu_j$, so that $\mu_j > \nu_j$. Furthermore, from $d_j = 0$ and $d_{j+1}\ge 0$ we deduce that $\mu_{j+1}\le\nu_{j+1}$. So, $\mu_{j+1}\le\nu_{j+1}\le\nu_j < \mu_j$, which shows that we can remove a box from row $j$ of $\mathrm{dg}(\mu)$ and still get a partition diagram.

We have just shown that it is permissible to modify $\mu$ by a raising operation that moves the box at the end of row $j$ to the end of row $i$. Let $\mu^1$ be the new partition obtained in this way, so that $\mu R\mu^1$. Consider how the partial sums $\mu_1+\cdots+\mu_s$ change when we replace $\mu$ by $\mu^1$. For $s < i$ or $s\ge j$, the partial sums are the same for $\mu$ and $\mu^1$. For $i\le s < j$, the partial sums increase by 1. Since $d_s > 0$ in the range $i\le s < j$, it follows that the new differences $d'_s = (\nu_1+\cdots+\nu_s) - (\mu^1_1+\cdots+\mu^1_s)$ are all nonnegative; in other words, $\mu^1\trianglelefteq\nu$. We have $d'_s = d_s - 1$ for $i\le s < j$, and $d'_s = d_s$ for all other $s$; so $\sum_s d'_s < \sum_s d_s$. By the induction hypothesis, we can find a chain of raising operations linking $\mu^1$ to $\nu$. This completes the induction step. $\square$
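
The inductive argument in the proof is constructive, and it can be unrolled into a loop. The Python sketch below follows the proof's choice of $i$ (the least index with $d_i > 0$) and $j$ (the least later index with $d_j = 0$); the function name `raising_chain` is hypothetical, and indices are 0-based in the code.

```python
def raising_chain(mu, nu):
    """Return [mu^0, ..., mu^m] with mu^0 = mu, mu^m = nu, consecutive terms related by R."""
    if sum(mu) != sum(nu):
        raise ValueError("mu and nu must be partitions of the same integer")
    length = max(len(mu), len(nu))
    cur = list(mu) + [0] * (length - len(mu))
    target = list(nu) + [0] * (length - len(nu))
    chain = [tuple(part for part in cur if part > 0)]
    while cur != target:
        # d[s] = (partial sum of nu) - (partial sum of the current partition)
        d, diff = [], 0
        for a, b in zip(cur, target):
            diff += b - a
            d.append(diff)
        if min(d) < 0:
            raise ValueError("mu is not dominated by nu")
        i = next(s for s in range(length) if d[s] > 0)
        j = next(s for s in range(i + 1, length) if d[s] == 0)
        cur[i] += 1   # add a box to the end of row i
        cur[j] -= 1   # remove the box at the end of row j
        chain.append(tuple(part for part in cur if part > 0))
    return chain

print(raising_chain((4, 3, 3, 1, 1), (5, 4, 2, 1)))
```

For this pair the sketch finds a chain of two raising operations; other chains between the same two partitions may also exist, since the theorem only asserts existence.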


As an application of the previous result, we prove a theorem relating the dominance ordering to the conjugation operation on partitions. We recall that for a partition $\mu$, the conjugate partition $\mu'$ is the partition whose parts are the column heights in the diagram of $\mu$. We obtain $\mathrm{dg}(\mu')$ from $\mathrm{dg}(\mu)$ by interchanging rows and columns in the diagram.


9.39. Theorem: Dominance and Conjugation. For all $\mu,\nu\in\mathrm{Par}(k)$, $\mu\trianglelefteq\nu$ iff $\nu'\trianglelefteq\mu'$.


Proof. Fix $\mu,\nu\in\mathrm{Par}(k)$. We first claim that $\mu R\nu$ implies $\nu' R\mu'$. This assertion follows from the pictorial description of the raising operation, since the box that moves from a lower row in $\mathrm{dg}(\mu)$ to a higher row in $\mathrm{dg}(\nu)$ necessarily moves from an earlier column to a later column (scanning columns from left to right). If we perform this move backward on the transposed partition diagrams, we see that we can pass from $\nu'$ to $\mu'$ by moving a box in $\mathrm{dg}(\nu')$ from a lower row to a higher row.

Now, assume $\mu\trianglelefteq\nu$. By Theorem 9.38, there exist partitions $\mu^0,\ldots,\mu^m$ with $\mu = \mu^0$, $\mu^{i-1}R\mu^i$ for $1\le i\le m$, and $\mu^m = \nu$. Applying the claim in the previous paragraph, we see that $(\mu^i)'R(\mu^{i-1})'$ for $1\le i\le m$, so we can go from $\nu'$ to $\mu'$ by a chain of raising operations. Invoking Theorem 9.38 again, we conclude that $\nu'\trianglelefteq\mu'$.

Conversely, assume that $\nu'\trianglelefteq\mu'$. Applying the result just proved, we get $\mu''\trianglelefteq\nu''$. Since $\mu'' = \mu$ and $\nu'' = \nu$, we have $\mu\trianglelefteq\nu$. $\square$
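
Theorem 9.39 is easy to test by brute force on small cases. The Python sketch below computes conjugates as column heights and compares the two dominance conditions over all pairs of partitions of $k$ for $k\le 7$; the helper names are hypothetical, and the check is an illustration rather than a proof.

```python
from itertools import zip_longest

def partitions(k, largest=None):
    """Yield all partitions of k as weakly decreasing tuples."""
    largest = k if largest is None else largest
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def dominated_by(mu, nu):
    """True iff every partial sum of mu is at most the matching partial sum of nu."""
    sum_mu = sum_nu = 0
    for a, b in zip_longest(mu, nu, fillvalue=0):
        sum_mu, sum_nu = sum_mu + a, sum_nu + b
        if sum_mu > sum_nu:
            return False
    return True

def conjugate(mu):
    """Column heights of the diagram of mu, read from left to right."""
    if not mu:
        return ()
    return tuple(sum(1 for part in mu if part > c) for c in range(mu[0]))

print(conjugate((4, 3, 3, 1, 1)), conjugate((5, 4, 2, 1)))
print(all(dominated_by(mu, nu) == dominated_by(conjugate(nu), conjugate(mu))
          for k in range(8)
          for mu in partitions(k) for nu in partitions(k)))
```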




9.7 Schur Bases 


We now have all the necessary tools to find bases for the vector spaces $\Lambda_N^k$ consisting of Schur polynomials. First we illustrate the key ideas with an example.


9.40. Example. In Example 9.9, we computed the Schur polynomials $s_\mu(x_1,x_2,x_3)$ for all partitions $\mu\in\mathrm{Par}(3)$. We can use Theorem 9.29 to write these Schur polynomials as linear combinations of monomial symmetric polynomials $m_\nu(x_1,x_2,x_3)$, where the coefficients are Kostka numbers:

\begin{align*}
s_{(1,1,1)} &= 1m_{(1,1,1)} \\
s_{(2,1)} &= 2m_{(1,1,1)} + 1m_{(2,1)} \\
s_{(3)} &= 1m_{(1,1,1)} + 1m_{(2,1)} + 1m_{(3)}.
\end{align*}

These equations can be combined to give the following matrix identity:

\[ \begin{bmatrix} s_{(1,1,1)} \\ s_{(2,1)} \\ s_{(3)} \end{bmatrix}
 = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
   \begin{bmatrix} m_{(1,1,1)} \\ m_{(2,1)} \\ m_{(3)} \end{bmatrix}. \]

The $3\times 3$ matrix appearing here is lower-triangular with 1's on the main diagonal, so this matrix is invertible. Multiplying by the inverse matrix, we find that

\[ \begin{bmatrix} m_{(1,1,1)} \\ m_{(2,1)} \\ m_{(3)} \end{bmatrix}
 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}
   \begin{bmatrix} s_{(1,1,1)} \\ s_{(2,1)} \\ s_{(3)} \end{bmatrix}. \]

This says that each monomial symmetric polynomial $m_\nu(x_1,x_2,x_3)$ is expressible as a linear combination of the Schur polynomials $s_\mu(x_1,x_2,x_3)$. Since $\{m_\nu : \nu\in\mathrm{Par}(3)\}$ is a basis of the vector space $\Lambda_3^3$, the Schur polynomials must span this space. Since $\dim(\Lambda_3^3) = p(3) = 3$, the three-element set $\{s_\mu(x_1,x_2,x_3) : \mu\in\mathrm{Par}(3)\}$ is in fact a basis of $\Lambda_3^3$.
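
The two matrices in this example can be recomputed from the definitions. The Python sketch below counts semistandard tableaux by brute force over all fillings (adequate only for very small shapes) and inverts the resulting lower-unitriangular matrix by forward substitution; the function names `kostka` and `invert_unitriangular` are hypothetical helpers.

```python
from itertools import product

def kostka(shape, content, n_vars):
    """Count semistandard tableaux of this shape and content with entries in 1..n_vars."""
    padded = tuple(content) + (0,) * (n_vars - len(content))
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    count = 0
    for values in product(range(1, n_vars + 1), repeat=len(cells)):
        T = dict(zip(cells, values))
        rows_ok = all(T[(r, c - 1)] <= T[(r, c)] for (r, c) in cells if c > 0)
        cols_ok = all(T[(r - 1, c)] < T[(r, c)] for (r, c) in cells if r > 0)
        content_ok = all(values.count(v) == padded[v - 1] for v in range(1, n_vars + 1))
        if rows_ok and cols_ok and content_ok:
            count += 1
    return count

def invert_unitriangular(K):
    """Inverse of a lower-triangular integer matrix with 1's on the diagonal."""
    n = len(K)
    inv = [[int(r == c) for c in range(n)] for r in range(n)]
    for r in range(n):
        for c in range(r):
            # Row r of K times column c of inv must vanish for r > c.
            inv[r][c] = -sum(K[r][t] * inv[t][c] for t in range(c, r))
    return inv

par3 = [(1, 1, 1), (2, 1), (3,)]      # Par(3) in lexicographic order
K = [[kostka(lam, mu, 3) for mu in par3] for lam in par3]
print(K)
print(invert_unitriangular(K))
```

Because the diagonal entries are 1, forward substitution never divides, so the inverse has integer entries.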
The reasoning in this example extends to the general case. The key fact is that the transition matrix from Schur polynomials to monomial symmetric polynomials is always lower-triangular with 1's on the main diagonal, as shown next. We refer to this matrix as a Kostka matrix, since its entries are Kostka numbers.


9.41. Theorem: Lower Unitriangularity of the Kostka Matrix. For all partitions $\lambda$, $K_{\lambda,\lambda} = 1$. For all partitions $\lambda$ and $\mu$, $K_{\lambda,\mu}\neq 0$ implies $\mu\trianglelefteq\lambda$ (and also $\mu\le_{\mathrm{lex}}\lambda$, by Theorem 9.35).


Proof. The Kostka number $K_{\lambda,\lambda}$ is the number of semistandard tableaux $T$ of shape $\lambda$ and content $\lambda$. Such a tableau must contain $\lambda_i$ copies of $i$ for each $i\ge 1$. In particular, $T$ contains $\lambda_1$ 1's. Since $T$ is semistandard, all these 1's must occur in the top row, which has $\lambda_1$ boxes. So the top row of $T$ contains all 1's. For the same reason, the $\lambda_2$ 2's in $T$ must all occur in the second row, which has $\lambda_2$ boxes. Continuing similarly, we see that $T$ must be the tableau whose $i$th row contains all $i$'s, for all $i\ge 1$. Thus, there is exactly one semistandard tableau of shape $\lambda$ and content $\lambda$. For example, when $\lambda = (4,2,2,1)$, $T$ is this tableau:

    1 1 1 1
    2 2
    3 3
    4

For the second part of the theorem, assume $\lambda,\mu\in\mathrm{Par}(k)$ and $K_{\lambda,\mu}\neq 0$; we prove $\mu\trianglelefteq\lambda$. Since the Kostka number is nonzero, there exists a semistandard tableau $T$ of shape $\lambda$ and content $\mu$. Every value in $T$ is a positive integer. Because the columns of $T$ must strictly increase, all 1's in $T$ must occur in row 1; all 2's in $T$ must occur in row 1 or row 2; and, in general, all $j$'s in $T$ must occur in the top $j$ rows of $\mathrm{dg}(\lambda)$. Now fix $i\ge 1$. Since $T$ has content $\mu$, the total number of occurrences of the symbols $1,2,\ldots,i$ in $T$ is $\mu_1+\cdots+\mu_i$. All of these symbols must appear in the top $i$ rows of $\mathrm{dg}(\lambda)$; these rows contain $\lambda_1+\cdots+\lambda_i$ boxes. We conclude that $\mu_1+\cdots+\mu_i \le \lambda_1+\cdots+\lambda_i$. This holds for every $i$, so $\mu\trianglelefteq\lambda$. $\square$


9.42. Example. Let $\lambda = (3,2,2)$ and $\mu = (2,2,2,1)$. The Kostka number $K_{\lambda,\mu}$ is 3, as we see by listing the semistandard tableaux of shape $\lambda$ and content $\mu$:

    1 1 2      1 1 3      1 1 4
    2 3        2 2        2 2
    3 4        3 4        3 3

In each tableau, all occurrences of $i$ appear in the top $i$ rows, and hence $\mu\trianglelefteq\lambda$.


9.43. Theorem: Schur Basis of $\Lambda_N^k$. For all $k\ge 0$ and $N>0$,
\[ \{s_\lambda(x_1,\ldots,x_N) : \lambda\in\mathrm{Par}_N(k)\} \]
is a basis for the vector space $\Lambda_N^k$.


Proof. Let $p = |\mathrm{Par}_N(k)|$, and let $S$ be the $p\times 1$ column vector consisting of the Schur polynomials $\{s_\lambda(x_1,\ldots,x_N) : \lambda\in\mathrm{Par}_N(k)\}$, arranged in lexicographic order. Let $M$ be the $p\times 1$ column vector consisting of the monomial symmetric polynomials $\{m_\mu(x_1,\ldots,x_N) : \mu\in\mathrm{Par}_N(k)\}$, also arranged in lexicographic order. Finally, let $K$ be the $p\times p$ matrix, with rows and columns indexed by elements of $\mathrm{Par}_N(k)$ in lexicographic order, such that the entry in row $\lambda$ and column $\mu$ is the Kostka number $K_{\lambda,\mu}$. Theorem 9.29 says that, for every $\lambda\in\mathrm{Par}_N(k)$,

\[ s_\lambda(x_1,\ldots,x_N) = \sum_{\mu\in\mathrm{Par}_N(k)} K_{\lambda,\mu}\, m_\mu(x_1,\ldots,x_N). \]

These scalar equations are equivalent to the matrix-vector equation $S = KM$. Moreover, Theorem 9.41 asserts that $K$ is a lower-triangular matrix of integers with 1's on the main diagonal. So $K$ has an inverse matrix (whose entries are also integers, since $\det(K) = 1$). Multiplying on the left by this inverse matrix, we get $M = K^{-1}S$. This equation means that every $m_\mu$ is a linear combination of Schur polynomials. Since the $m_\mu$ generate $\Lambda_N^k$, the Schur polynomials must also generate this space. Linear independence follows automatically since the number of Schur polynomials in the proposed basis (namely $p$) equals the dimension of the vector space, by Theorem 9.23. $\square$


9.44. Remark. The entries of the inverse Kostka matrix $K^{-1}$ tell us how to expand monomial symmetric polynomials in terms of Schur polynomials. As seen in the $3\times 3$ example, these entries are integers that can be negative. We give a combinatorial interpretation for these inverse Kostka numbers in §10.16, using signed objects called special rim-hook tableaux.


9.45. Remark. If $\lambda\in\mathrm{Par}(k)$ has more than $N$ parts, then $s_\lambda(x_1,\ldots,x_N) = 0$. This follows since there are not enough values available in the alphabet to fill the first column of $\mathrm{dg}(\lambda)$ with a strictly increasing sequence. So there are no semistandard tableaux of this shape using this alphabet.




9.8 Tableau Insertion 


We have seen that the Kostka numbers give the coefficients in the monomial expansion of Schur polynomials. Surprisingly, the Kostka numbers also relate Schur polynomials to the elementary and complete homogeneous symmetric polynomials. This fact is a consequence of the Pieri Rules, which tell us how to rewrite products of the form $s_\mu e_k$ and $s_\mu h_k$ as linear combinations of Schur polynomials.

To develop these results, we need a fundamental combinatorial construction on tableaux called tableau insertion. Given a semistandard tableau $T$ of shape $\mu$ and a value $x$, we will build a new semistandard tableau by inserting $x$ into $T$, as follows.


9.46. Definition: Tableau Insertion Algorithm. Given a semistandard tableau $T$ of shape $\mu$ and a value $x\in\mathbb{Z}$, define the insertion of $x$ into $T$, denoted $T\leftarrow x$, by the following procedure.


1. If $\mu = (0)$, so that $T$ is the empty tableau, then $T\leftarrow x$ is the tableau of shape $(1)$ whose sole entry is $x$.


2. Otherwise, let $y_1\le y_2\le\cdots\le y_m$ be the entries in the top row of $T$.


2a. If $y_m\le x$, then $T\leftarrow x$ is the tableau of shape $(\mu_1+1,\mu_2,\ldots)$ obtained by placing a new box containing $x$ at the right end of the top row of $T$.

2b. Otherwise, choose the least $i\in\{1,2,\ldots,m\}$ such that $x < y_i$. Let $T'$ be the semistandard tableau consisting of all rows of $T$ below the top row. To form $T\leftarrow x$, first replace $y_i$ by $x$ in the top row of $T$. Then replace $T'$ by $T'\leftarrow y_i$, which is computed recursively by the same algorithm.


If step 2b occurs, we say that $x$ has bumped $y_i$ out of row 1. In turn, $y_i$ may bump an element from row 2 to row 3, and so on.
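
The insertion procedure can be sketched in a few lines of Python. This version is an illustration under two assumptions not made in the text: a tableau is stored as a list of weakly increasing rows, and the recursion of step 2b is unrolled into a loop over the rows. The function name `insert` is hypothetical.

```python
from bisect import bisect_right

def insert(tableau, x):
    """Return the tableau obtained by inserting x into a semistandard tableau."""
    rows = [list(row) for row in tableau]   # work on a copy
    for row in rows:
        i = bisect_right(row, x)    # index of the leftmost entry strictly larger than x
        if i == len(row):           # step 2a: no larger entry, so x goes at the right end
            row.append(x)
            return rows
        row[i], x = x, row[i]       # step 2b: x bumps row[i] into the next row
    rows.append([x])                # step 1: the remaining tableau is empty
    return rows

print(insert([[1, 2, 4], [3, 5]], 2))
print(insert([], 7))
```

In the first call, 2 bumps 4 out of the top row, 4 bumps 5 out of the second row, and 5 starts a new third row.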
This recursive insertion algorithm always terminates, since the number of times we execute step 2b is at most $\ell(\mu)$, which is finite. We must also prove that the algorithm always produces a semistandard tableau of partition shape. We prove these facts after considering some examples.


9.47. Example. Let us compute $T\leftarrow 3$ for the following tableau $T$:


We scan the top row of $T$ from left to right, looking for the first entry strictly larger than 3. This entry is the 4 in the fifth box. In step 2b, the 3 bumps the 4 into the second row. The current situation looks like this:


Now we scan the second row from left to right, looking for the first entry strictly larger than 4. It is the 5, so the 4 bumps the 5 into the third row:


Next, the 5 bumps the 7 into the fourth row:


Now, everything in the fourth row is weakly smaller than 7. So, as directed by step 2a, we insert 7 at the end of this row. The final tableau $T\leftarrow 3$ is therefore


We have underlined the entries of $T\leftarrow 3$ that were affected by the insertion process. These entries are the starting value $x = 3$ together with those entries that got bumped during the insertion. Call these entries the bumping sequence; in this example, the bumping sequence is $(3,4,5,7)$. The sequence of boxes occupied by the bumping sequence is called the bumping path. The lowest box in the bumping path is called the new box. It is the only box in $T\leftarrow 3$ that was not present in the original diagram for $T$.

Here is a simpler example of tableau insertion:

[Figure: an insertion into the one-row tableau 1 1 2 3 4 4 6 6.]

The reader may check that:

[Figure: a further insertion example.]


To prove that $T\leftarrow x$ is always a semistandard tableau of partition shape, we need the following result.


9.48. Theorem: Bumping Sequence and Bumping Path. Given a semistandard tableau $T$ and element $x$, let $(x_1,x_2,\ldots,x_k)$ be the bumping sequence and let $((1,j_1),(2,j_2),\ldots,(k,j_k))$ be the bumping path arising in the computation of $T\leftarrow x$. Then $x = x_1 < x_2 < \cdots < x_k$ and $j_1\ge j_2\ge\cdots\ge j_k > 0$. So the bumping sequence strictly increases and the bumping path moves weakly left as it goes down.


Proof. By definition of the bumping sequence, $x = x_1$ and $x_i$ bumps $x_{i+1}$ from row $i$ into row $i+1$, for $1\le i < k$. By definition of bumping, $x_i$ bumps an entry strictly larger than itself, so $x_i < x_{i+1}$ for all $i < k$. Next, consider what happens to $x_{i+1}$ when it is bumped out of row $i$. Before being bumped, $x_{i+1}$ occupied the cell $(i,j_i)$. After being bumped, $x_{i+1}$ will occupy the cell $(i+1,j_{i+1})$, which is either an existing cell in row $i+1$ of $T$, or a new cell at the end of this row. Consider the cell $(i+1,j_i)$ directly below $(i,j_i)$. If this cell is outside the shape of $T$, the previous observation shows that $(i+1,j_{i+1})$ must be located weakly left of this cell, so that $j_{i+1}\le j_i$. On the other hand, if $(i+1,j_i)$ is part of $T$ and contains some value $z$, then $x_{i+1} < z$ because $T$ is semistandard. Now, $x_{i+1}$ bumps the leftmost entry in row $i+1$ that is strictly larger than $x_{i+1}$. Since $z$ is such an entry, $x_{i+1}$ bumps $z$ or some entry to the left of $z$. In either case, we again have $j_{i+1}\le j_i$. $\square$
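
The conclusion of Theorem 9.48 can be observed by recording the bumping sequence and bumping path during an insertion. The Python sketch below is illustrative only: it assumes tableaux are lists of weakly increasing rows, reports cells as 1-based (row, column) pairs, and uses the hypothetical function name `insert_with_path`.

```python
from bisect import bisect_right

def insert_with_path(tableau, x):
    """Insert x and return (new tableau, bumping sequence, bumping path)."""
    rows = [list(row) for row in tableau]
    sequence, path = [], []
    for r, row in enumerate(rows):
        i = bisect_right(row, x)        # leftmost entry strictly larger than x
        sequence.append(x)
        path.append((r + 1, i + 1))     # 1-based cell that x will occupy
        if i == len(row):
            row.append(x)               # x lands in a new box at the end of the row
            return rows, sequence, path
        row[i], x = x, row[i]           # x bumps row[i] into the next row
    rows.append([x])                    # the last bumped value starts a new row
    sequence.append(x)
    path.append((len(rows), 1))
    return rows, sequence, path

new_tableau, seq, path = insert_with_path([[1, 2, 4], [3, 5]], 2)
print(seq, path)
```

Here the bumping sequence is $(2,4,5)$ and the column indices along the path are $3, 2, 1$, so the sequence strictly increases while the path moves weakly left.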


9.49. Theorem: Output of a Tableau Insertion. If $T$ is a semistandard tableau of shape $\mu$, then $T\leftarrow x$ is a semistandard tableau whose shape is a partition diagram obtained by adding one new box to $\mathrm{dg}(\mu)$.


Proof. Let us first show that the shape of T < z is a partition diagram. This shape is 
obtained from dg(j) by adding one new box (the last box in the bumping path). If this new 
box is in the top row, then the resulting shape is a partition diagram, being the diagram of 
(141 + 1, 42, 3,---). Suppose the new box is in row i > 1. In this case, Theorem 9.48 shows 
that the new box is located weakly left of a box in the previous row that belongs to dg(j.). 
This implies that py; < “;-1, so adding the new box to row 7 still gives a partition diagram. 

Next we prove that each time an entry of $T$ is bumped during the insertion of $x$, the
resulting filling is still a semistandard tableau. Suppose, at some stage in the insertion
process, that an element $y$ bumps $z$ out of the following configuration:

        a
      b z c
        d

(Some of the boxes containing $a, b, c, d$ may be absent, in which case the following proof must
be modified appropriately.) The original configuration is part of a semistandard tableau, so
$b \le z \le c$ and $a < z < d$. Because $y$ bumps $z$, $z$ must be the first entry strictly larger than $y$
in its row. This means that $b \le y < z \le c$, so replacing $z$ by $y$ still leaves a weakly increasing
row. Does the column containing $z$ still strictly increase after the bumping? On one hand,
$y < d$, since $y < z < d$. On the other hand, if the box containing $a$ exists (i.e., if $z$ is below
the top row), then $y$ was the element bumped out of $a$'s row. Since the bumping path moves
weakly left, the original location of $y$ must have been weakly right of $z$ in the row above
$z$. If $y$ was directly above $z$, then $a$ must have bumped $y$, and so $a < y$ by definition of
bumping. Otherwise, $y$ was located strictly to the right of $a$ before $y$ was bumped, so $a \le y$.
We cannot have $a = y$ in this situation, since otherwise $a$ (or something to its left) would
have been bumped instead of $y$. Thus, $a < y$ in all cases.

Finally, consider what happens at the end of the insertion process, when an element w 
is inserted in a new box at the end of a (possibly empty) row. This only happens when w 
weakly exceeds all entries in its row, so the row containing w is weakly increasing. There is 
no cell below w in this case. Repeating the argument at the end of the last paragraph, we 
see that w is strictly greater than the entry directly above it (if any). This completes the 
proof that $T \leftarrow x$ is a semistandard tableau. □

9.9 Reverse Insertion


Given the output $T \leftarrow x$ of a tableau insertion operation, it is generally not possible to
determine what $T$ and $x$ were. However, if we also know the location of the new box created
by this insertion, then we can recover $T$ and $x$. More generally, we can start with any
semistandard tableau $S$ and any corner box of $S$, and uninsert the value in this box to
obtain a semistandard tableau $T$ and value $x$ such that $S = T \leftarrow x$. (Here we do not
assume in advance that $S$ has the form $T \leftarrow x$.) This process is called reverse tableau
insertion. Before giving the general definition, we look at some examples.


9.50. Example. Consider the following semistandard tableau S: 


There are three corner boxes whose removal from $S$ still leaves a partition diagram; they
are the boxes at the end of the first, third, and sixth rows. Removing the corner box in the
top row, we have $S = T_1 \leftarrow 4$, where $T_1$ is this tableau:


Suppose instead that we remove the 6 at the end of the third row of $S$. Reversing the
bumping process, we see that 6 must have been bumped into the third row from the second
row. What element bumped it? In this case, it is the 5 in the second row. In turn, the 5
must have originally resided in the first row, before being bumped into the second row by
the 4. In summary, we have $S = T_2 \leftarrow 4$, where $T_2$ is this tableau:


Here we have underlined the entries in the reverse bumping sequence, which occupy boxes 
of S in the reverse bumping path. Finally, consider what happens when we uninsert the 8 
at the end of the last row of S. The 8 was bumped to its current location by one of the 
6’s in the previous row; it must have been bumped by the rightmost 6, since the 8 could 
not appear to the left of the 6 in the previous semistandard tableau. Next, the rightmost 6 
was bumped by the 5 in row 4; the 5 was bumped by the rightmost 4 in row 3; and so on. 
In general, to determine which element in row $i$ bumped some value $z$ into row $i+1$, we
look for the rightmost entry in row $i$ that is strictly less than $z$. Continuing in this way, we
discover that $S = T_3 \leftarrow 2$, where $T_3$ is the tableau shown here:


With these examples in hand, we are ready to give the general definition of reverse 
tableau insertion. 


9.51. Definition: Reverse Tableau Insertion. Suppose $S$ is a semistandard tableau
of shape $\nu$. A corner box of $\nu$ is a box $(i,j) \in \mathrm{dg}(\nu)$ such that $\mathrm{dg}(\nu) - \{(i,j)\}$ is still the
diagram of some partition $\mu$. Given $S$ and a corner box $(i,j)$ of $\nu$, we define a tableau $T$
and a value $x$ as follows. We construct a reverse bumping sequence $(x_i, x_{i-1}, \ldots, x_1)$ and a
reverse bumping path $((i, j_i), (i-1, j_{i-1}), \ldots, (1, j_1))$ as follows.


1. Set $j_i = j$ and $x_i = S(i,j)$, which is the value of $S$ in the given corner box.


2. Once $x_k$ and $j_k$ have been found, for some $i \ge k > 1$, scan row $k-1$ of $S$ for the
rightmost entry that is strictly less than $x_k$. Define $x_{k-1}$ to be this entry, and let
$j_{k-1}$ be the column in which this entry occurs.


3. At the end, let $x = x_1$, and let $T$ be the tableau obtained by erasing box $(i, j_i)$
from $S$ and replacing the contents of box $(k-1, j_{k-1})$ by $x_k$ for $i \ge k > 1$.
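
The three steps above can be sketched in Python. This is only an illustration under stated assumptions, not code from the text: a tableau is assumed to be stored as a list of weakly increasing rows, rows are indexed from 0 (the text indexes from 1), and the function names `insert` and `reverse_insert` are hypothetical. The forward routine uses the bumping rule recalled in the proofs above (bump the leftmost entry strictly larger than the inserted value), so the two routines can be checked against each other.

```python
from bisect import bisect_left, bisect_right

def insert(T, x):
    """Row-insert x into T. Returns (new tableau, row index of the new box)."""
    T = [row[:] for row in T]
    for r, row in enumerate(T):
        c = bisect_right(row, x)        # leftmost entry strictly larger than x
        if c == len(row):               # no such entry: x goes at the end of the row
            row.append(x)
            return T, r
        row[c], x = x, row[c]           # x bumps row[c] into the next row
    T.append([x])                       # the bumped value starts a new row
    return T, len(T) - 1

def reverse_insert(S, i):
    """Uninsert the last entry of row i of S (assumed to be a corner box).
    Returns (T, x) such that inserting x into T reproduces S."""
    S = [row[:] for row in S]
    x = S[i].pop()                      # step 1: value in the corner box
    if not S[i]:
        del S[i]                        # a corner box alone in its row: drop the row
    for k in range(i - 1, -1, -1):      # step 2: move up one row at a time
        c = bisect_left(S[k], x) - 1    # rightmost entry strictly less than x
        S[k][c], x = x, S[k][c]         # step 3: x replaces it; it becomes the new x
    return S, x

T = [[1, 2, 2, 5], [3, 4], [6]]
S, r = insert(T, 2)
print(S, r)                             # [[1, 2, 2, 2], [3, 4, 5], [6]] 1
print(reverse_insert(S, r) == (T, 2))   # True
```

The binary searches rely on each row being weakly increasing; a plain left-to-right scan would serve equally well for small tableaux.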


The next results show that reverse insertion really is the two-sided inverse of ordinary 
insertion (given knowledge of the location of the new box). 


9.52. Theorem: Properties of Reverse Insertion. Suppose we perform reverse tableau
insertion on $S$ and $(i, j)$ to obtain a filling $T$ and value $x$. (a) The reverse bumping sequence
satisfies $x_i > x_{i-1} > \cdots > x_1 = x$. (b) The reverse bumping path satisfies $j_i \le j_{i-1} \le \cdots \le j_1$.
(c) $T$ is a semistandard tableau of partition shape. (d) $(T \leftarrow x) = S$.


Proof. Part (a) follows from the definition of $x_{k-1}$ in the reverse insertion algorithm. Note
that there does exist an entry in row $k-1$ strictly less than $x_k$, since the entry directly
above $x_k$ (in cell $(k-1, j_k)$ of $S$) is such an entry. This observation also shows that the
rightmost entry strictly less than $x_k$ in row $k-1$ occurs in column $j_k$ or later, proving (b).




Part (c) follows from (a) and (b) by an argument similar to that given in Theorem 9.49; we 
ask the reader to fill in the details. 

For part (d), consider the bumping sequence $(x'_1, x'_2, \ldots)$ and bumping path
$((1, j'_1), (2, j'_2), \ldots)$ for the forward insertion $T \leftarrow x$. We have $x'_1 = x_1 = x$ by definition.
Recall that $x_1 = S(1, j_1)$ is the rightmost entry in row 1 of $S$ that is strictly less than
$x_2$, and $T(1, j_1) = x_2$ by definition of $T$. All other entries in row 1 are the same in $S$ and
$T$. Then $T(1, j_1) = x_2$ is the leftmost entry of row 1 of $T$ strictly larger than $x_1$. So, in
the insertion $T \leftarrow x$, $x_1$ bumps $x_2$ out of cell $(1, j_1)$. In particular, $j'_1 = j_1$ and $x'_2 = x_2$.
Repeating this argument in each successive row, we see by induction that $x'_k = x_k$ and
$j'_k = j_k$ for all $k$. At the end of the insertion, we have recovered the original tableau $S$. □


9.53. Theorem: Reversing Insertion. Suppose $S = (T \leftarrow x)$ for some semistandard
tableau $T$ and value $x$. Let $(i, j)$ be the new box created by this insertion. If we perform
reverse insertion on $S$ starting with box $(i, j)$, then we obtain the original $T$ and $x$.


Proof. This can be proved by induction, showing step by step that the forward and reverse
bumping paths and bumping sequences are the same. The reasoning is similar to part (d)
of Theorem 9.52, so we ask the reader to supply the proof. □


The next theorem summarizes the results of the last two sections. 


9.54. Theorem: Invertibility of Tableau Insertion. For a fixed partition $\mu$ and positive
integer $N$, let $P(\mu)$ be the set of all partitions that can be obtained from $\mu$ by adding a
single box at the end of some row. There exist mutually inverse bijections

$$I : \mathrm{SSYT}_N(\mu) \times [N] \to \bigcup_{\nu \in P(\mu)} \mathrm{SSYT}_N(\nu), \qquad R : \bigcup_{\nu \in P(\mu)} \mathrm{SSYT}_N(\nu) \to \mathrm{SSYT}_N(\mu) \times [N],$$

where $I(T, x)$ is $T \leftarrow x$, and $R(S)$ is the result of applying reverse tableau insertion to $S$
starting at the unique box of $S$ not in $\mu$. A similar statement holds for tableaux using the
alphabet $\mathbb{Z}_{>0}$.


Proof. We have seen that $I$ and $R$ are well-defined functions mapping into the stated
codomains. Theorem 9.52(d) says that $I \circ R$ is the identity map on $\bigcup_{\nu \in P(\mu)} \mathrm{SSYT}_N(\nu)$,
while Theorem 9.53 says that $R \circ I$ is the identity map on $\mathrm{SSYT}_N(\mu) \times [N]$. Hence $I$ and
$R$ are bijections. □


We can regard the set $[N] = \{1, 2, \ldots, N\}$ as a weighted set with $\mathrm{wt}(i) = x_i$ for $1 \le i \le N$.
The generating function for this weighted set is $x_1 + x_2 + \cdots + x_N = h_1(x_1, \ldots, x_N) =
s_{(1)}(x_1, \ldots, x_N) = e_1(x_1, \ldots, x_N) = p_1(x_1, \ldots, x_N)$. The content monomial $x^{T \leftarrow j}$ is $x^T x_j$,
since $T \leftarrow j$ contains all the entries of $T$ together with one new entry equal to $j$. This means
that $\mathrm{wt}(I(T, j)) = \mathrm{wt}(T)\,\mathrm{wt}(j)$, so that the bijection $I$ in Theorem 9.54 is weight-preserving.
Using the Product Rule for Weighted Sets and the definition of Schur polynomials, the
generating function for the domain of $I$ is $s_\mu(x_1, \ldots, x_N)\, h_1(x_1, \ldots, x_N)$. Using the Sum Rule
for Weighted Sets, the generating function for the codomain of $I$ is $\sum_{\nu \in P(\mu)} s_\nu(x_1, \ldots, x_N)$.
In conclusion, the tableau insertion algorithms have furnished a combinatorial proof of the
following multiplication rules:

$$s_\mu h_1 = s_\mu e_1 = s_\mu s_{(1)} = s_\mu p_1 = \sum_{\nu \in P(\mu)} s_\nu,$$

where we sum over all partitions $\nu$ obtained by adding one corner box to $\mu$. We have
discovered the simplest instance of the Pieri Rules mentioned at the beginning of §9.8.

9.10 The Bumping Comparison Theorem


We now extend the analysis of the previous section to prove the general Pieri Rules for
expanding $s_\mu h_k$ and $s_\mu e_k$ in terms of Schur polynomials. The key idea is to see what
happens when we successively insert $k$ weakly increasing numbers (or $k$ strictly decreasing
numbers) into a semistandard tableau by repeated tableau insertion. We begin with some
examples to build intuition.


9.55. Example. Let T be the semistandard tableau shown here: 


Let us compute the tableaux that result by successively inserting the weakly increasing 
sequence of values 2,3,3,5 into T: 


Consider the four new boxes in the diagram of $T_4$ that are not in the diagram of $T$, which
are marked by asterisks in the following picture:


These boxes form what is called a horizontal strip of size 4, since no two new boxes are in 
the same column. Next, compare the bumping paths in the successive insertions of 2,3, 3, 5. 
We see that each path lies strictly right of the previous bumping path and ends with a new 
box in a weakly higher row. 

Now return to the original tableau $T$, and consider the insertion of a strictly decreasing
sequence 5, 4, 2, 1. We obtain the following tableaux:

[Figure: the tableaux $S_1 = (T \leftarrow 5)$, $S_2$, $S_3$, $S_4$ produced by these successive insertions.]

This time, each successive bumping path is weakly left of the previous one and ends in a
strictly lower row. Accordingly, the new boxes in $S_4$ form a vertical strip, where no two
boxes occupy the same row:


We now show that the observations in this example hold in general. 


9.56. The Bumping Comparison Theorem. Given a semistandard tableau $T$ and
values $x, y \in \mathbb{Z}_{>0}$, let the new box in $T \leftarrow x$ be $(i, j)$, and let the new box in $(T \leftarrow x) \leftarrow y$ be
$(r, s)$. (a) $x \le y$ iff $i \ge r$ and $j < s$; (b) $x > y$ iff $i < r$ and $j \ge s$.


Proof. It suffices to prove the forward implications, since exactly one of $x \le y$ or $x > y$
is true. Let the bumping path for the insertion of $x$ be $((1, j_1), (2, j_2), \ldots, (i, j_i))$, where
$j_i = j$, and let the bumping sequence be $(x = x_1, x_2, \ldots, x_i)$. Let the bumping path for the
insertion of $y$ be $((1, s_1), (2, s_2), \ldots, (r, s_r))$, where $s_r = s$, and let the bumping sequence be
$(y = y_1, y_2, \ldots, y_r)$.

Assume $x \le y$. We prove the following statement by induction: for all $k$ with $1 \le k \le r$,
we have $i \ge k$ and $x_k \le y_k$ and $j_k < s_k$. When $k = 1$, we have $i \ge 1$ and $x_1 \le y_1$ (by
assumption). Note that $x_1$ appears in box $(1, j_1)$ of $T \leftarrow x$. We cannot have $s_1 \le j_1$, for this
would mean that $y_1$ bumps an entry weakly left of $(1, j_1)$, and this entry is at most $x_1 \le y_1$,
contrary to the definition of bumping. So $j_1 < s_1$. Now consider the induction step. Suppose
$k < r$ and the induction hypothesis holds for $k$; does it hold for $k+1$? Since $k < r$, $y_k$ must
have bumped something from position $(k, s_k)$ into the next row. Since $j_k < s_k$, $x_k$ must also
have bumped something out of row $k$, proving that $i \ge k+1$. The object bumped by $x_k$,
namely $x_{k+1}$, appears to the left of the object bumped by $y_k$, namely $y_{k+1}$, in the same row
of a semistandard tableau. Therefore, $x_{k+1} \le y_{k+1}$. Now we can repeat the argument used
for the first row to see that $j_{k+1} < s_{k+1}$. Now that the induction is complete, take $k = r$
to see that $i \ge r$ and $j = j_i \le j_r < s_r = s$ (the first inequality holding since the bumping
path for $T \leftarrow x$ moves weakly left as we go down).

Next, assume $x > y$. This time we prove the following by induction: for all $k$ with
$1 \le k \le i$, we have $r > k$ and $x_k > y_k$ and $j_k \ge s_k$. When $k = 1$, we have $x_1 = x > y = y_1$.
Since $x$ appears somewhere in the first row of $T \leftarrow x$, $y$ will necessarily bump something
into the second row, so $r > 1$. In fact, the value bumped by $y$ occurs weakly left of the
position $(1, j_1)$ occupied by $x$, so $s_1 \le j_1$. For the induction step, assume the induction
hypothesis is known for some $k < i$, and try to prove it for $k+1$. Since $k < i$ and $k < r$,
both $x_k$ and $y_k$ must bump elements out of row $k$ into row $k+1$. The element $y_{k+1}$ bumped
by $y_k$ occurs in column $s_k$, which is weakly left of the cell $(k, j_k)$ occupied by $x_k$ in $T \leftarrow x$.
Therefore, $y_{k+1} \le x_k$, and $x_k$ is strictly less than $x_{k+1}$, the original occupant of cell $(k, j_k)$
in $T$. So $x_{k+1} > y_{k+1}$. Repeating the argument used in the first row for row $k+1$, we now
see that $y_{k+1}$ must bump something in row $k+1$ into row $k+2$ (so that $r > k+1$), and
$s_{k+1} \le j_{k+1}$. This completes the induction. Taking $k = i$, we finally conclude that $r > i$
and $j = j_i \ge s_i \ge s_r = s$. □
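
The statement of the theorem is easy to test exhaustively on small cases. The following Python sketch does so for one tableau; it is an illustration only, assuming tableaux are stored as lists of rows with 0-indexed boxes, and the helper names `insert` and `check_bumping_comparison` are hypothetical.

```python
from bisect import bisect_right
from itertools import product

def insert(T, x):
    """Row-insert x into T. Returns (new tableau, (row, col) of the new box)."""
    T = [row[:] for row in T]
    for r, row in enumerate(T):
        c = bisect_right(row, x)        # leftmost entry strictly larger than x
        if c == len(row):
            row.append(x)
            return T, (r, c)
        row[c], x = x, row[c]           # bump into the next row
    T.append([x])
    return T, (len(T) - 1, 0)

def check_bumping_comparison(T, values):
    """Check parts (a) and (b) of the theorem for every pair (x, y) of values."""
    for x, y in product(values, repeat=2):
        Tx, (i, j) = insert(T, x)       # new box (i, j) for x
        _, (r, s) = insert(Tx, y)       # new box (r, s) for y
        if x <= y:
            ok = i >= r and j < s       # weakly higher row, strictly to the right
        else:
            ok = i < r and j >= s       # strictly lower row, weakly to the left
        if not ok:
            return False
    return True

T = [[1, 2, 2, 5], [3, 4], [6]]
print(check_bumping_comparison(T, range(1, 8)))   # True
```

A finite check of this kind is of course no substitute for the proof; it is mainly useful for catching indexing mistakes in an implementation of insertion.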

9.11 The Pieri Rules


To state the Pieri Rules, we first formally define horizontal and vertical strips. 


9.57. Definition: Horizontal and Vertical Strips. A horizontal strip is a set of cells
in $\mathbb{Z}_{>0}^2$, no two in the same column. A vertical strip is a set of cells in $\mathbb{Z}_{>0}^2$, no two in the
same row.


9.58. Theorem: Inserting a Monotone Sequence into a Tableau. Let $T$ be a semistandard
tableau of shape $\mu$, and let $S$ be the semistandard tableau obtained from $T$ by
insertion of $z_1, z_2, \ldots, z_k$ in this order; we write $S = (T \leftarrow z_1 z_2 \cdots z_k)$ in this situation. Let
$\nu$ be the shape of $S$. (a) If $z_1 \le z_2 \le \cdots \le z_k$, then $\mathrm{dg}(\nu)$ is obtained from $\mathrm{dg}(\mu)$ by adding
a horizontal strip of size $k$. (b) If $z_1 > z_2 > \cdots > z_k$, then $\mathrm{dg}(\nu)$ is obtained from $\mathrm{dg}(\mu)$ by
adding a vertical strip of size $k$.


Proof. For (a), assume $z_1 \le z_2 \le \cdots \le z_k$. By the Bumping Comparison Theorem, the new
boxes $(i_1, j_1), \ldots, (i_k, j_k)$ created by the insertion of $z_1, \ldots, z_k$ satisfy $j_1 < j_2 < \cdots < j_k$.
Thus, these boxes form a horizontal strip of size $k$. For (b), assume $z_1 > z_2 > \cdots > z_k$. In
this case, the new boxes satisfy $i_1 < i_2 < \cdots < i_k$, so these boxes form a vertical strip of
size $k$. □


Since tableau insertion is reversible given the location of the new box, we can also reverse 
the insertion of a monotone sequence, in the following sense. 


9.59. Theorem: Reverse Insertion of a Monotone Sequence. Suppose $\mu$ and $\nu$ are
given partitions, and $S$ is any semistandard tableau of shape $\nu$. (a) If $\mathrm{dg}(\nu) - \mathrm{dg}(\mu)$ is a
horizontal strip of size $k$, then there exists a unique sequence $z_1 \le z_2 \le \cdots \le z_k$ and a unique
semistandard tableau $T$ of shape $\mu$ such that $S = (T \leftarrow z_1 z_2 \cdots z_k)$. (b) If $\mathrm{dg}(\nu) - \mathrm{dg}(\mu)$
is a vertical strip of size $k$, then there exists a unique sequence $z_1 > z_2 > \cdots > z_k$ and a
unique semistandard tableau $T$ of shape $\mu$ such that $S = (T \leftarrow z_1 z_2 \cdots z_k)$.


Proof. To prove the existence of $T$ and $z_1, \ldots, z_k$ in part (a), we repeatedly perform reverse
tableau insertion, erasing each cell in the horizontal strip $\mathrm{dg}(\nu) - \mathrm{dg}(\mu)$ from right to left.
This produces a sequence of elements $z_k, \ldots, z_2, z_1$ and a semistandard tableau $T$ of shape
$\mu$ such that $(T \leftarrow z_1 z_2 \cdots z_k) = S$. By comparing the relative locations of the new boxes
created by $z_i$ and $z_{i+1}$, we see from the Bumping Comparison Theorem that $z_i \le z_{i+1}$ for
all $i$.

As for uniqueness, suppose $T'$ and $z'_1 \le z'_2 \le \cdots \le z'_k$ also satisfy $S = (T' \leftarrow z'_1 z'_2 \cdots z'_k)$.
Since $z'_1 \le z'_2 \le \cdots \le z'_k$, the Bumping Comparison Theorem shows that the insertion of
$z'_1, \ldots, z'_k$ creates the new boxes in $\nu$ in order from left to right, just as the insertion of
$z_1, \ldots, z_k$ does. Write $T_0 = T$, $T_i = (T \leftarrow z_1 z_2 \cdots z_i)$, $T'_0 = T'$, and $T'_i = (T' \leftarrow z'_1 z'_2 \cdots z'_i)$.
Since reverse tableau insertion produces a unique answer given the location of the new box,
we see by reverse induction on $i$ that $T_i = T'_i$ and $z_i = z'_i$ for $k \ge i \ge 0$.

Part (b) is proved similarly; here we erase cells in $\mathrm{dg}(\nu) - \mathrm{dg}(\mu)$ from bottom to top. □


9.60. Theorem: Pieri Rules. Given an integer partition $\mu$ and a positive integer $k$, let
$H_k(\mu)$ consist of all partitions $\nu$ such that $\mathrm{dg}(\nu) - \mathrm{dg}(\mu)$ is a horizontal strip of size $k$, and
let $V_k(\mu)$ consist of all partitions $\nu$ such that $\mathrm{dg}(\nu) - \mathrm{dg}(\mu)$ is a vertical strip of size $k$. For
every $N > 0$, there are weight-preserving bijections

$$F : \mathrm{SSYT}_N(\mu) \times \mathrm{SSYT}_N((k)) \to \bigcup_{\nu \in H_k(\mu)} \mathrm{SSYT}_N(\nu);$$

$$G : \mathrm{SSYT}_N(\mu) \times \mathrm{SSYT}_N((1^k)) \to \bigcup_{\nu \in V_k(\mu)} \mathrm{SSYT}_N(\nu).$$

Consequently, we have the Pieri Rules in $\Lambda_N$:

$$s_\mu h_k = \sum_{\nu \in H_k(\mu)} s_\nu; \qquad s_\mu e_k = \sum_{\nu \in V_k(\mu)} s_\nu.$$


Proof. Recall that a semistandard tableau of shape $(k)$ can be identified with a weakly
increasing sequence $z_1 \le z_2 \le \cdots \le z_k$ of elements of $[N]$. So, we can define $F(T, z_1 z_2 \cdots z_k) =
(T \leftarrow z_1 z_2 \cdots z_k)$. By Theorem 9.58, $F$ does map into the stated codomain. Theorem 9.59
shows that $F$ is a bijection. Moreover, $F$ is weight-preserving, since the content monomial
of $(T \leftarrow z_1 z_2 \cdots z_k)$ is $x^T x_{z_1} x_{z_2} \cdots x_{z_k}$.

Similarly, a semistandard tableau of shape $(1^k)$ can be identified with a strictly increasing
sequence $y_1 < y_2 < \cdots < y_k$. Reversing this gives a strictly decreasing sequence. So we define
$G(T, y_1 y_2 \cdots y_k) = (T \leftarrow y_k \cdots y_2 y_1)$. As above, Theorems 9.58 and 9.59 show that $G$ is a
well-defined bijection.

Finally, the Pieri Rules follow by passing from weighted sets to generating functions,
applying the Sum and Product Rules for Weighted Sets, and using $h_k = s_{(k)}$ and $e_k =
s_{(1^k)}$. □

9.61. Example. We have

$$s_{(4,3,1)} h_2 = s_{(6,3,1)} + s_{(5,4,1)} + s_{(5,3,2)} + s_{(5,3,1,1)} + s_{(4,4,2)} + s_{(4,4,1,1)} + s_{(4,3,3)} + s_{(4,3,2,1)},$$

as we see by drawing the following diagrams:


Similarly, we find that

$$s_{(2,2)} e_3 = s_{(2,2,1,1,1)} + s_{(3,2,1,1)} + s_{(3,3,1)}$$

by adding vertical strips to $\mathrm{dg}(2,2)$ as shown here:
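
The shapes in this example can also be generated mechanically. The following Python sketch is one possible way to do so, not a procedure from the text: partitions are assumed to be stored as tuples of positive parts, and the names `horizontal_strips`, `vertical_strips`, and `conjugate` are hypothetical. It uses the fact that a horizontal strip may add boxes to row $r$ only up to the length of row $r-1$ of the original shape, and it obtains vertical strips by conjugating.

```python
def horizontal_strips(mu, k):
    """All partitions nu such that nu/mu is a horizontal strip of size k."""
    mu = list(mu) + [0]                  # allow one new row below mu
    results = []

    def build(row, remaining, nu):
        if row == len(mu):
            if remaining == 0:
                results.append(tuple(p for p in nu if p > 0))
            return
        # Row 0 may grow freely; a lower row may grow only up to the length of
        # the row above it in mu, so no two new boxes share a column.
        cap = remaining if row == 0 else min(remaining, mu[row - 1] - mu[row])
        for add in range(cap, -1, -1):
            build(row + 1, remaining - add, nu + [mu[row] + add])

    build(0, k, [])
    return results

def conjugate(la):
    """Conjugate (transpose) of a partition."""
    return tuple(sum(1 for p in la if p > j) for j in range(la[0])) if la else ()

def vertical_strips(mu, k):
    """nu/mu is a vertical strip exactly when nu'/mu' is a horizontal strip."""
    return [conjugate(nu) for nu in horizontal_strips(conjugate(mu), k)]

print(horizontal_strips((4, 3, 1), 2))   # the eight shapes in s_(4,3,1) h_2
print(vertical_strips((2, 2), 3))        # the three shapes in s_(2,2) e_3
```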


9.12 Schur Expansion of $h_\alpha$


Iteration of the Pieri Rules lets us compute the Schur expansions of products of the form
$s_\mu h_{\alpha_1} h_{\alpha_2} \cdots h_{\alpha_s}$, or $s_\mu e_{\alpha_1} e_{\alpha_2} \cdots e_{\alpha_s}$, or even mixed products involving both $h$'s and $e$'s.
Taking $\mu = 0$, so that $s_\mu = 1$, we obtain in particular the expansions of $h_\alpha$ and $e_\alpha$ into
sums of Schur polynomials. As we will see, examination of these expansions leads to another
appearance of the Kostka matrix.

9.62. Example. Let us use the Pieri Rule to find the Schur expansion of $h_{(2,1,3)} = h_2 h_1 h_3$.
To start, recall that $h_2 = s_{(2)}$. Adding one box to $\mathrm{dg}(2)$ in all possible ways gives

$$h_2 h_1 = s_{(3)} + s_{(2,1)}.$$

Now we add a horizontal strip of size 3 in all possible ways to get

$$h_2 h_1 h_3 = s_{(3)} h_3 + s_{(2,1)} h_3$$
$$= [s_{(6)} + s_{(5,1)} + s_{(4,2)} + s_{(3,3)}] + [s_{(5,1)} + s_{(4,2)} + s_{(4,1,1)} + s_{(3,2,1)}]$$
$$= s_{(6)} + 2s_{(5,1)} + 2s_{(4,2)} + s_{(4,1,1)} + s_{(3,3)} + s_{(3,2,1)}.$$

Observe that each of the Schur polynomials $s_{(5,1)}$ and $s_{(4,2)}$ occurs twice in the final
expansion. Now, consider the computation of $h_{(2,3,1)} = h_2 h_3 h_1$. Since multiplication of
polynomials is commutative, this symmetric polynomial must be the same as $h_{(2,1,3)}$. But the
computations with the Pieri Rule involve different intermediate objects. We initially calculate

$$h_2 h_3 = s_{(2)} h_3 = s_{(5)} + s_{(4,1)} + s_{(3,2)}.$$

Multiplying this expression by $h_1$ gives

$$h_2 h_3 h_1 = s_{(5)} h_1 + s_{(4,1)} h_1 + s_{(3,2)} h_1$$
$$= [s_{(6)} + s_{(5,1)}] + [s_{(5,1)} + s_{(4,2)} + s_{(4,1,1)}] + [s_{(4,2)} + s_{(3,3)} + s_{(3,2,1)}],$$

which is the same as the previous answer after collecting terms. As an exercise, the reader
may compute $h_{(3,2,1)} = h_3 h_2 h_1$ and verify that the final answer is again the same.
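
The iteration carried out by hand in this example can be sketched in Python as follows. This is an illustrative sketch under assumptions of our own: partitions are tuples, an expansion is stored as a `Counter` mapping shapes to coefficients, and the names `horizontal_strips` and `schur_expansion_h` are hypothetical. Horizontal strips are found here by the interlacing test $\nu_1 \ge \mu_1 \ge \nu_2 \ge \mu_2 \ge \cdots$, which is equivalent to requiring that no two new boxes lie in the same column.

```python
from collections import Counter
from itertools import product

def horizontal_strips(mu, k):
    """Partitions nu with nu/mu a horizontal strip of size k (interlacing test)."""
    lower = tuple(mu) + (0,)                   # nu_i >= mu_i
    upper = (sum(mu) + k,) + tuple(mu)         # nu_i <= mu_{i-1} for i >= 2
    ranges = [range(lo, hi + 1) for lo, hi in zip(lower, upper)]
    size = sum(mu) + k
    return [tuple(p for p in nu if p)
            for nu in product(*ranges) if sum(nu) == size]

def schur_expansion_h(alpha):
    """Schur expansion of h_alpha as a Counter {shape: coefficient}."""
    expansion = Counter({(): 1})               # start from s_() = 1
    for a in alpha:                            # multiply by h_a via the Pieri Rule
        step = Counter()
        for mu, coeff in expansion.items():
            for nu in horizontal_strips(mu, a):
                step[nu] += coeff
        expansion = step
    return expansion

print(schur_expansion_h((2, 1, 3)))
print(schur_expansion_h((2, 1, 3)) == schur_expansion_h((3, 2, 1)))   # True
```

The brute-force search over all interlacing candidates is adequate for shapes of this size; a recursive construction would scale better.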


9.63. Example. We have seen that a given Schur polynomial may appear several times
in the Schur expansion of $h_\alpha$. Is there some way to find the coefficient of a particular
Schur polynomial in this expansion, without listing all the shapes generated by iteration of
the Pieri Rule? To answer this question, consider the problem of finding the coefficient of
$s_{(5,4,3)}$ when $h_{(4,3,3,2)}$ is expanded into a sum of Schur polynomials. Consider the shapes
that appear when we repeatedly use the Pieri Rule on the product $h_4 h_3 h_3 h_2$. Initially, we
have a single shape (4) corresponding to $h_4$. Next, we add a horizontal strip of size 3 in all
possible ways. Then we add another horizontal strip of size 3 in all possible ways. Finally,
we add a horizontal strip of size 2 in all possible ways. The coefficient we seek is the number
of ways that the shape (5,4,3) can be built by making the ordered sequence of choices just
described. For example, here is one choice sequence that leads to the shape (5,4,3):


[Figure: a succession of shapes from (4) to (5,4,3), with the boxes of each new horizontal strip marked.]


Here is a second choice sequence that leads to the same shape:


Now comes the key observation. We have exhibited each choice sequence by drawing a
succession of shapes showing the addition of each new horizontal strip. The same information
can be encoded by drawing one copy of the final shape (5,4,3) and putting a label in each
box to show which horizontal strip caused that box to first appear in the shape. For example,
the three choice sequences displayed above are encoded (in order) by the following three
objects:


These three objects are semistandard tableaux of shape (5,4,3) and content (4,3,3,2). By
definition of the encoding just described, we see that every choice sequence under
consideration is encoded by some filling of content (4,3,3,2). Since we build the filling by adding
horizontal strips one at a time using increasing labels, it follows that the filling we get
is always a semistandard tableau. Finally, we can go backward in the sense that any
semistandard tableau of content (4,3,3,2) can be built uniquely by choosing a succession
of horizontal strips telling us where the 1's, 2's, 3's, and 4's appear in the tableau. To
summarize these remarks, our encoding scheme proves that the coefficient of $s_{(5,4,3)}$ in the
Schur expansion of $h_{(4,3,3,2)}$ is the number of semistandard tableaux of shape (5,4,3) and
content (4,3,3,2). In addition to the three semistandard tableaux already drawn, we have
the following tableaux of this shape and content:


[Figure: the three remaining semistandard tableaux of shape (5,4,3) and content (4,3,3,2).]


So the requested coefficient in this example is 6.
The reasoning in the last example generalizes to prove the following result. 


9.64. Theorem: Schur Expansion of $h_\alpha$. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_s)$ be any sequence of
nonnegative integers with sum $k$. Then

$$h_\alpha(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\lambda,\alpha}\, s_\lambda(x_1, \ldots, x_N).$$

(It is also permissible to sum over $\mathrm{Par}(k)$ here.)


Proof. By the Pieri Rule, the coefficient of $s_\lambda$ in $h_\alpha$ is the number of sequences of partitions

$$0 = \mu^0 \subseteq \mu^1 \subseteq \mu^2 \subseteq \cdots \subseteq \mu^s = \lambda \qquad (9.6)$$

such that $\mathrm{dg}(\mu^i) - \mathrm{dg}(\mu^{i-1})$ is a horizontal strip of size $\alpha_i$, for $1 \le i \le s$. (This is a formal
way of describing which horizontal strips we choose at each application of the Pieri Rule
to the product $h_\alpha$.) On the other hand, $K_{\lambda,\alpha}$ is the number of semistandard tableaux of
shape $\lambda$ and content $\alpha$. There is a bijection between the sequences (9.6) and these tableaux,
defined by filling each horizontal strip $\mathrm{dg}(\mu^i) - \mathrm{dg}(\mu^{i-1})$ with $\alpha_i$ copies of the letter $i$. The
resulting filling has content $\alpha$ and is a semistandard tableau. The inverse map sends a
semistandard tableau $T$ to the sequence $(\mu^i : 0 \le i \le s)$, where $\mathrm{dg}(\mu^i)$ consists of the cells
of $T$ containing symbols in $\{1, 2, \ldots, i\}$. □
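
The Kostka numbers in this theorem can be computed independently of the Pieri Rule by counting tableaux directly. The following Python sketch does this by backtracking; it is an illustration under our own conventions (shapes and contents as tuples, and the hypothetical function name `kostka`), not an algorithm taken from the text.

```python
def kostka(shape, content):
    """Count semistandard tableaux of the given shape and content by backtracking."""
    if sum(shape) != sum(content):
        return 0
    # Visit cells row by row, so the left and upper neighbours are filled first.
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    remaining = list(content)            # remaining[v - 1] = copies of v not yet placed
    filling = {}

    def fill(idx):
        if idx == len(cells):
            return 1
        r, c = cells[idx]
        lo = 1
        if c > 0:
            lo = max(lo, filling[(r, c - 1)])        # rows weakly increase
        if r > 0:
            lo = max(lo, filling[(r - 1, c)] + 1)    # columns strictly increase
        total = 0
        for v in range(lo, len(remaining) + 1):
            if remaining[v - 1] > 0:
                remaining[v - 1] -= 1
                filling[(r, c)] = v
                total += fill(idx + 1)
                remaining[v - 1] += 1                # undo the choice
        return total

    return fill(0)

print(kostka((5, 4, 3), (4, 3, 3, 2)))   # 6, the coefficient found in Example 9.63
```

Agreement between this direct count and the number of chains of horizontal strips is exactly the content of the bijection in the proof.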


9.65. Remark. Suppose $\alpha, \beta$ are sequences such that $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$. Note that $h_\alpha =
h_\beta$ since multiplication of polynomials is commutative. Expanding each side into Schur
polynomials gives

$$\sum_{\lambda \in \mathrm{Par}(k)} K_{\lambda,\alpha}\, s_\lambda(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}(k)} K_{\lambda,\beta}\, s_\lambda(x_1, \ldots, x_N).$$

For $N \ge k$, the Schur polynomials appearing here are linearly independent by Theorem 9.43.
So $K_{\lambda,\alpha} = K_{\lambda,\beta}$ for all $\lambda$, in agreement with Theorem 9.27. (This remark leads to an
algebraic proof of Theorem 9.27, provided one first gives an algebraic proof of the linear
independence of Schur polynomials.)


9.66. Theorem: Complete Homogeneous Basis of $\Lambda_N^k$. For all $k \ge 0$ and $N > 0$,

$$\{h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)\}$$

is a basis of the vector space $\Lambda_N^k$.


Proof. Consider the column vectors $S = (s_\lambda(x_1, \ldots, x_N) : \lambda \in \mathrm{Par}_N(k))$ and $H =
(h_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k))$, where the entries are listed in lexicographic order. As
in the proof of Theorem 9.43, let $K = [K_{\lambda,\mu}]$ be the Kostka matrix with rows and columns
indexed by partitions in $\mathrm{Par}_N(k)$ in lexicographic order. Recall from Theorem 9.41 that $K$ is
a lower-triangular matrix with 1's on the main diagonal. In matrix notation, Theorem 9.64
asserts that $H = K^{\mathrm{tr}} S$, where $K^{\mathrm{tr}}$ is the transpose of the Kostka matrix. This transpose
is upper-triangular with 1's on the main diagonal, hence is invertible. Since $H$ is obtained
from $S$ by multiplying by an invertible matrix of scalars, we see that the elements of $H$
form a basis by the same reasoning used in the proof of Theorem 9.43. □


9.67. Remark. Combining Theorems 9.66 and 9.43, we can write $H = (K^{\mathrm{tr}} K) M$, where $M$
is the vector of monomial symmetric polynomials indexed by $\mathrm{Par}_N(k)$. This matrix equation
gives the monomial expansion of the complete homogeneous symmetric polynomials $h_\mu$.




9.13 Schur Expansion of $e_\alpha$


Now we turn to the elementary symmetric polynomials $e_\alpha$. We can iterate the Pieri Rule
as we did for $h_\alpha$, but here we must add vertical strips at each stage.


9.68. Example. Let us compute the Schur expansion of $e_{(2,2,2)} = e_2 e_2 e_2$. First, $e_2 e_2 =
s_{(1,1)} e_2 = s_{(2,2)} + s_{(2,1,1)} + s_{(1,1,1,1)}$. Next,

$$e_2 e_2 e_2 = [s_{(3,3)} + s_{(3,2,1)} + s_{(2,2,1,1)}]$$
$$+ [s_{(3,2,1)} + s_{(3,1,1,1)} + s_{(2,2,2)} + s_{(2,2,1,1)} + s_{(2,1,1,1,1)}]$$
$$+ [s_{(2,2,1,1)} + s_{(2,1,1,1,1)} + s_{(1,1,1,1,1,1)}]$$
$$= s_{(3,3)} + 2s_{(3,2,1)} + s_{(3,1,1,1)} + s_{(2,2,2)} + 3s_{(2,2,1,1)} + 2s_{(2,1^4)} + s_{(1^6)}.$$

As in the case of $h_\alpha$, we can use fillings to encode the sequence of vertical strips chosen in
the repeated application of the Pieri Rule. For example, the following fillings encode the
three choice sequences that lead to the shape (2,2,1,1) in the expansion of $e_{(2,2,2)}$:


We see at once that these fillings are not semistandard tableaux. However, transposing
the diagrams will produce semistandard tableaux of shape $(2,2,1,1)' = (4,2)$ and content
(2,2,2), as shown here:


This encoding gives a bijection from the relevant choice sequences to the collection of
semistandard tableaux of this shape and content. So the coefficient of $s_{(2,2,1,1)}$ in the Schur
expansion of $e_{(2,2,2)}$ is the Kostka number $K_{(4,2),(2,2,2)} = 3$. This reasoning generalizes to
prove the following theorem.


9.69. Theorem: Schur Expansion of $e_\alpha$. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_s)$ be any sequence of
nonnegative integers with sum $k$. Then

$$e_\alpha(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\lambda',\alpha}\, s_\lambda(x_1, \ldots, x_N) = \sum_{\nu \in \mathrm{Par}_N(k)'} K_{\nu,\alpha}\, s_{\nu'}(x_1, \ldots, x_N).$$


9.70. Remark. We have written $\mathrm{Par}_N(k)'$ for the set $\{\lambda' : \lambda \in \mathrm{Par}_N(k)\}$. Since conjugation
of a partition interchanges the number of parts with the length of the largest part, we have

$$\mathrm{Par}_N(k)' = \{\nu \in \mathrm{Par}(k) : \nu_1 \le N\} = \{\nu \in \mathrm{Par}(k) : \nu_i \le N \text{ for all } i \ge 1\}.$$

It is also permissible to sum over all partitions of $k$ in the theorem, since this only adds zero
terms to the sum. If the number of variables is large enough ($N \ge k$), then we are already
summing over all partitions of $k$.
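
The computation in Example 9.68 can be reproduced with a short Python sketch. As before, this is an illustration under our own assumptions (partitions as tuples, expansions as `Counter` objects, hypothetical names `vertical_strips` and `schur_expansion_e`), valid when the number of variables is at least $k$ so that no Schur polynomial vanishes. A vertical strip is generated directly by choosing the rows that receive one new box each.

```python
from collections import Counter
from itertools import combinations

def vertical_strips(mu, k):
    """All nu with nu/mu a vertical strip of size k: at most one new box per row."""
    rows = list(mu) + [0] * k                  # up to k new rows may be created
    out = []
    for chosen in combinations(range(len(rows)), k):
        nu = rows[:]
        for r in chosen:
            nu[r] += 1
        if all(nu[i] >= nu[i + 1] for i in range(len(nu) - 1)):   # still a partition
            out.append(tuple(p for p in nu if p))
    return out

def schur_expansion_e(alpha):
    """Schur expansion of e_alpha as a Counter {shape: coefficient}."""
    expansion = Counter({(): 1})               # start from s_() = 1
    for a in alpha:                            # multiply by e_a via the Pieri Rule
        step = Counter()
        for mu, coeff in expansion.items():
            for nu in vertical_strips(mu, a):
                step[nu] += coeff
        expansion = step
    return expansion

for shape, coeff in sorted(schur_expansion_e((2, 2, 2)).items(), reverse=True):
    print(coeff, shape)
```

For instance, the coefficient of the shape (2,2,1,1) returned here is 3, matching the Kostka number $K_{(4,2),(2,2,2)}$ found in Example 9.68.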


9.71. Theorem: Elementary Basis of $\Lambda_N^k$. For all $k \ge 0$ and $N > 0$,

$$\{e_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)'\} = \{e_{\mu'}(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k)\}$$

is a basis of the vector space $\Lambda_N^k$. Consequently, the set of all polynomials $e_1^{i_1} e_2^{i_2} \cdots e_N^{i_N}$, where
the $i_j$ are arbitrary nonnegative integers, is a basis of $\Lambda_N$.


Proof. We use the same matrix argument employed earlier, adjusted to account for the
shape conjugation. As in the past, let us index the rows and columns of matrices and
vectors by the partitions in $\mathrm{Par}_N(k)$, listed in lexicographic order. Introduce column vectors
$S = (s_\lambda(x_1, \ldots, x_N) : \lambda \in \mathrm{Par}_N(k))$ and $E = (e_{\mu'}(x_1, \ldots, x_N) : \mu \in \mathrm{Par}_N(k))$. Next,
consider the modified Kostka matrix $\hat{K}$ whose entry in row $\mu$ and column $\lambda$ is $K_{\lambda',\mu'}$.
Theorem 9.69 asserts that

$$e_{\mu'}(x_1, \ldots, x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\lambda',\mu'}\, s_\lambda(x_1, \ldots, x_N).$$

By definition of matrix-vector multiplication, the equations just written are equivalent to
$E = \hat{K} S$. Since the entries of $S$ are known to be a basis, it suffices (as in the proofs of
Theorems 9.43 and 9.66) to show that $\hat{K}$ is a triangular matrix with 1's on the diagonal.
There are 1's on the diagonal, since $\hat{K}(\mu, \mu) = K_{\mu',\mu'} = 1$. On the other hand, using Theorem 9.41,
Theorem 9.39, and Theorem 9.35, we have the implications

$$\hat{K}(\mu, \lambda) \ne 0 \;\Rightarrow\; K_{\lambda',\mu'} \ne 0 \;\Rightarrow\; \mu' \trianglelefteq \lambda' \;\Rightarrow\; \lambda \trianglelefteq \mu \;\Rightarrow\; \lambda \le_{\mathrm{lex}} \mu.$$

So $\hat{K}$ is lower-triangular.

Since the vector space $\Lambda_N$ is the direct sum of its subspaces $\Lambda_N^k$ for $k \ge 0$, we get a basis
for $\Lambda_N$ by combining bases for these subspaces. The resulting basis of $\Lambda_N$ consists of all $e_\nu$
with $\nu \in \mathrm{Par}_N'$, meaning that $\nu_i \le N$ for all $i$. If $\nu$ has $i_j$ parts equal to $j$ for $1 \le j \le N$,
the definition of $e_\nu$ shows that $e_\nu = e_1^{i_1} e_2^{i_2} \cdots e_N^{i_N}$. Thus we obtain the basis of $\Lambda_N$ in the
theorem statement. □


9.72. Remark. Combining this theorem with Theorem 9.43, we can write $E = (\hat{K} K) M$,
where $M$ is the vector of monomial symmetric polynomials indexed by $\mathrm{Par}_N(k)$. This matrix
equation gives the monomial expansion of the elementary symmetric polynomials $e_{\mu'}$.




9.14 Algebraic Independence 


We now use Theorem 9.71 to obtain structural information about the ring $\Lambda_N$ of symmetric
polynomials in $N$ variables. First we need the following definition.


9.73. Definition: Algebraic Independence. Let $z_1, \ldots, z_m$ be a list of polynomials
in $\mathbb{R}[x_1, \ldots, x_N]$. We say that this list is algebraically independent iff the collection of all
monomials

$$\{z^\alpha = z_1^{\alpha_1} z_2^{\alpha_2} \cdots z_m^{\alpha_m} : \alpha \in \mathbb{Z}_{\ge 0}^m\}$$

is linearly independent. This means that whenever a finite linear combination of the
monomials $z^\alpha$ is zero, say

$$\sum_{\alpha \in F} c_\alpha z^\alpha = 0 \qquad (F \text{ finite, each } c_\alpha \in \mathbb{R}),$$

then $c_\alpha = 0$ for all $\alpha \in F$.

9.74. Example. Consider the list $x_1, \ldots, x_N$ of all formal variables in the polynomial ring
$\mathbb{R}[x_1, \ldots, x_N]$. By the very definition of (formal) polynomials, if $\sum_{\alpha \in F} c_\alpha x^\alpha = 0$, then every
$c_\alpha$ is zero. So the formal variables $x_1, \ldots, x_N$ are algebraically independent. On the other
hand, consider the three polynomials $z_1 = x_1 + x_2$, $z_2 = x_1^2 + x_2^2$, and $z_3 = x_1^3 + x_2^3$ in
$\mathbb{R}[x_1, x_2]$. The polynomials $z_1, z_2, z_3$ are linearly independent, as one may check. However,
they are not algebraically independent, because of the relation

$$z_1^3 - 3 z_1 z_2 + 2 z_3 = 0.$$

Later, we show that $z_1, z_2$ is an algebraically independent list.
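
The relation among $z_1, z_2, z_3$ can be confirmed by expanding everything in terms of $x_1, x_2$. The following Python sketch does this with a minimal polynomial representation of our own choosing (a dictionary from exponent pairs to integer coefficients); the helper names `mul` and `combine` are hypothetical.

```python
from collections import defaultdict

def mul(p, q):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for a, ca in p.items():
        for b, cb in q.items():
            out[tuple(i + j for i, j in zip(a, b))] += ca * cb
    return dict(out)

def combine(*terms):
    """Linear combination of (scalar, polynomial) pairs; zero coefficients are dropped."""
    out = defaultdict(int)
    for scalar, p in terms:
        for exponent, c in p.items():
            out[exponent] += scalar * c
    return {e: c for e, c in out.items() if c != 0}

z1 = {(1, 0): 1, (0, 1): 1}     # x1 + x2
z2 = {(2, 0): 1, (0, 2): 1}     # x1^2 + x2^2
z3 = {(3, 0): 1, (0, 3): 1}     # x1^3 + x2^3

# z1^3 - 3 z1 z2 + 2 z3, expanded as a polynomial in x1, x2
relation = combine((1, mul(z1, mul(z1, z1))), (-3, mul(z1, z2)), (2, z3))
print(relation)                 # {} : every coefficient cancels
```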


Here is a more sophisticated way of looking at algebraic independence. Suppose
$z_1, \ldots, z_m$ is any list of polynomials in $B = \mathbb{R}[x_1, \ldots, x_N]$, and $A = \mathbb{R}[y_1, \ldots, y_m]$ is a
polynomial ring in new formal variables $y_1, \ldots, y_m$. There is an algebra homomorphism
$E : A \to B$ that sends $f(y_1, \ldots, y_m) \in A$ to $f(z_1, \ldots, z_m) \in B$. (This means that $E$
preserves addition, ring multiplication, and multiplication by real scalars; see the Appendix for
more discussion.) The map $E$ is called an evaluation homomorphism because it acts on an
input $f$ by evaluating each formal variable $y_j$ at the value $z_j$. On one hand, the image of the
homomorphism $E$ is the subalgebra of $B$ generated by $z_1, \ldots, z_m$. On the other hand, the
kernel of $E$ consists of all polynomials $f \in A$ such that $E(f) = 0$. Writing $f = \sum_\alpha c_\alpha y^\alpha$,
we see that $E(f) = 0$ iff $\sum_\alpha c_\alpha z^\alpha = 0$.

Now suppose the given list $z_1, \ldots, z_m$ is algebraically independent. This means that
$\sum_\alpha c_\alpha z^\alpha = 0$ implies that all $c_\alpha$ are 0, and hence $f = 0$. Thus, the kernel of $E$ is zero,
and $E$ is therefore one-to-one. The reasoning is reversible, so we conclude that algebraic
independence of the list $z_1, \ldots, z_m$ is equivalent to injectivity of the evaluation
homomorphism determined by this list. In this case, $E$ is an algebra isomorphism from the formal
polynomial ring $\mathbb{R}[y_1, \ldots, y_m]$ onto the subalgebra of $\mathbb{R}[x_1, \ldots, x_N]$ generated by $z_1, \ldots, z_m$.

We can use these comments to rephrase Theorem 9.71. That theorem states that the
set of all monomials $e_1^{i_1} \cdots e_N^{i_N}$ forms a basis for $\Lambda_N$, where $e_j$ denotes the $j$th elementary
symmetric polynomial in variables $x_1, \ldots, x_N$. On one hand, the fact that these monomials
span $\Lambda_N$ means that $\Lambda_N$ is the subalgebra of $\mathbb{R}[x_1, \ldots, x_N]$ generated by $e_1, \ldots, e_N$. On
the other hand, the linear independence of the monomials means that $e_1, \ldots, e_N$ is an
algebraically independent list, so that the evaluation homomorphism $E$ sending $y_j$ to $e_j$
is one-to-one. Thus, $E : \mathbb{R}[y_1, \ldots, y_N] \to \Lambda_N$ is an algebra isomorphism. The following
theorem summarizes these structural results.
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9.75. The Fundamental Theorem of Symmetric Polynomials. The elementary 
symmetric polynomials {e;(41,...,vn) : 1 <j < N} are algebraically independent. The 
evaluation map E : R{yi,...,yn] — Aw sending each y; to e; is an algebra isomorphism. 
So Aw is isomorphic to a polynomial ring in N variables. Moreover, for every symmetric 
polynomial f(#1,...,2.), there exists a unique polynomial g(yi,...,yn) such that f = 


E(g) = g(e1,..-, en). 
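To make the last statement concrete, here is a small worked instance (our own illustration, taking $N = 3$; the polynomials $g$ shown are simply the ones obtained by expanding and comparing coefficients).

```latex
% Two symmetric polynomials in x_1, x_2, x_3 written as polynomials in e_1, e_2, e_3.
\begin{align*}
x_1^2 + x_2^2 + x_3^2 &= e_1^2 - 2e_2,
  & g(y_1,y_2,y_3) &= y_1^2 - 2y_2; \\
x_1^3 + x_2^3 + x_3^3 &= e_1^3 - 3e_1e_2 + 3e_3,
  & g(y_1,y_2,y_3) &= y_1^3 - 3y_1y_2 + 3y_3.
\end{align*}
```

In each case the theorem asserts that no other polynomial $g$ works.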




9.15 Power-Sum Symmetric Polynomials 


Recall that the power-sum symmetric polynomials in $N$ variables are defined by
$p_k(x_1,\ldots,x_N) = x_1^k + x_2^k + \cdots + x_N^k$ for all $k \ge 1$ and
$p_\alpha(x_1,\ldots,x_N) = \prod_{j \ge 1} p_{\alpha_j}(x_1,\ldots,x_N)$.
It turns out that the polynomials $p_1,\ldots,p_N$ are algebraically independent. One way to prove
this is to invoke the following determinant criterion for algebraic independence.


9.76. Theorem: Determinant Test for Algebraic Independence. Given a list
$g_1,\ldots,g_N$ of $N$ polynomials in $\mathbb{R}[x_1,\ldots,x_N]$, let $A$ be the $N \times N$
matrix whose $j,k$-entry is the partial derivative $\partial g_k/\partial x_j$, and let
$J \in \mathbb{R}[x_1,\ldots,x_N]$ be the determinant of $A$. If $J \ne 0$, then the list
$g_1,\ldots,g_N$ is algebraically independent.


Proof. We prove the contrapositive. Assume the list $g_1,\ldots,g_N$ is algebraically dependent.
Then there exist nonzero polynomials $h \in \mathbb{R}[y_1,\ldots,y_N]$ such that
$h(g_1,\ldots,g_N) = 0$. Choose such an $h$ whose degree (in the $y$-variables) is as small as
possible. We can find the partial derivative of $h(g_1,\ldots,g_N)$ with respect to each variable
$x_j$ by applying the multivariable chain rule. We obtain the $N$ equations

\[ \sum_{k=1}^{N} \frac{\partial h}{\partial y_k}(g_1,\ldots,g_N)\,\frac{\partial g_k}{\partial x_j} = 0
   \quad \text{for } 1 \le j \le N. \tag{9.7} \]

Let $v$ be the column vector whose $k$th entry is $\frac{\partial h}{\partial y_k}(g_1,\ldots,g_N)$.
The equations in (9.7) are equivalent to the matrix identity $Av = 0$. We now show that $v$ is not
the zero vector. Note that $h \in \mathbb{R}[y_1,\ldots,y_N]$ cannot be a constant polynomial, since
$h \ne 0$ but $h(g_1,\ldots,g_N) = 0$. So at least one partial derivative of $h$ must be a nonzero
polynomial in $\mathbb{R}[y_1,\ldots,y_N]$. For any $k$ with $\partial h/\partial y_k \ne 0$, the
degree of $\partial h/\partial y_k$ in the $y$-variables is lower than the degree of $h$. By choice
of $h$, it follows that $\frac{\partial h}{\partial y_k}(g_1,\ldots,g_N)$ is nonzero in
$\mathbb{R}[x_1,\ldots,x_N]$. This polynomial is the $k$th entry of $v$, so $v \ne 0$. Now $Av = 0$
with $v \ne 0$ forces $A$ to be non-invertible, so that $J = \det(A) = 0$ by a theorem of linear
algebra. □


9.77. Remark. The converse of Theorem 9.76 is also true: if the list $g_1,\ldots,g_N$ is
algebraically independent, then the Jacobian $J$ is nonzero. We do not need this fact, so we omit
the proof; see [64, §3.10].


9.78. Theorem: Algebraic Independence of Power-Sums. The power-sum polynomials
$\{p_k(x_1,\ldots,x_N) : 1 \le k \le N\}$ are algebraically independent.

Proof. We apply the determinant criterion from Theorem 9.76. The $j,k$-entry of the matrix
$A$ is

\[ \frac{\partial p_k}{\partial x_j}
   = \frac{\partial}{\partial x_j}\left(x_1^k + x_2^k + \cdots + x_j^k + \cdots + x_N^k\right)
   = k x_j^{k-1}. \]

Therefore, $J = \det[k x_j^{k-1}]_{1 \le j,k \le N}$. For each column $k$, we may factor out the
scalar $k$ to see that $J = N! \det[x_j^{k-1}]$. The resulting determinant (after reversing the
order of the columns) is called a Vandermonde determinant. This determinant evaluates to
$\pm \prod_{1 \le r < s \le N}(x_r - x_s)$, which is a nonzero polynomial (see §12.11 for a
combinatorial proof of this formula). We conclude that $J \ne 0$, which proves the result. □
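As a numerical sanity check of this computation, the following Python sketch evaluates the Jacobian matrix of $p_1,\ldots,p_N$ at an integer point and compares its determinant with $N!$ times the Vandermonde product. The function names and the sample point are our own choices for illustration, and the determinant is computed by brute force, which is only sensible for small $N$.

```python
from itertools import permutations
from math import factorial


def det(matrix):
    """Determinant via the Leibniz permutation expansion (fine for small sizes)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        # The sign of a permutation is (-1)^(number of inversions).
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def power_sum_jacobian(xs):
    """Matrix whose (j, k)-entry is d p_k / d x_j = k * x_j^(k-1), evaluated at xs."""
    n = len(xs)
    return [[k * x ** (k - 1) for k in range(1, n + 1)] for x in xs]


def vandermonde_product(xs):
    """Product of (x_s - x_r) over all index pairs r < s."""
    result = 1
    for r in range(len(xs)):
        for s in range(r + 1, len(xs)):
            result *= xs[s] - xs[r]
    return result


xs = [2, 3, 5, 7]  # hypothetical sample point with distinct coordinates
J = det(power_sum_jacobian(xs))
print(J == factorial(len(xs)) * vandermonde_product(xs))
```

A nonzero value at a single point already shows that $J$ is not the zero polynomial; choosing a point with two equal coordinates makes two rows of the matrix coincide, so the determinant vanishes there.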


Now that we know that the list $p_1,\ldots,p_N$ is algebraically independent, we can obtain
power-sum bases for the vector spaces $\Lambda_N^k$ and $\Lambda_N$.


9.79. Theorem: Power-Sum Basis. For all $k \ge 0$ and $N > 0$,

\[ \{p_\mu(x_1,\ldots,x_N) : \mu \in \mathrm{Par}_N(k)'\} \]

is a basis of the vector space $\Lambda_N^k$. The collection
$\{p_1^{i_1} \cdots p_N^{i_N} : i_1,\ldots,i_N \ge 0\}$ is a basis of the vector space $\Lambda_N$.
The evaluation map $E : \mathbb{R}[y_1,\ldots,y_N] \to \Lambda_N$ sending each $y_j$ to $p_j$ is
an algebra isomorphism. So, for every symmetric polynomial $f(x_1,\ldots,x_N)$, there exists a
unique polynomial $g(y_1,\ldots,y_N)$ such that $f = E(g) = g(p_1,\ldots,p_N)$.


9.80. Remark. As mentioned earlier, everything said here is valid if we replace $\mathbb{R}$ by any
field $K$ containing $\mathbb{Q}$. But the results in this section fail if we use coefficients from
a field $K$ of characteristic $p > 0$. For example, if $\mathrm{char}(K) = 3$, then $p_1, p_2, p_3$
is an algebraically dependent list in $K[x_1,x_2,x_3]$. To see why, note that
$g(y_1,y_2,y_3) = y_1^3 - y_3$ is nonzero in $K[y_1,y_2,y_3]$, but
$g(p_1,p_2,p_3) = (x_1+x_2+x_3)^3 - (x_1^3+x_2^3+x_3^3)$ is zero in $K[x_1,x_2,x_3]$.
This follows by expanding $(x_1+x_2+x_3)^3$ using the Multinomial Theorem and noting that
all terms other than $x_1^3+x_2^3+x_3^3$ are multiples of $3 = 1_K + 1_K + 1_K = 0$.
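The divisibility claim in this remark is easy to confirm by machine. The sketch below (our own illustration; the helper names are hypothetical) lists the integer coefficients of $(x_1+x_2+x_3)^3 - (x_1^3+x_2^3+x_3^3)$ using multinomial coefficients and checks that each one is a multiple of 3.

```python
from itertools import product
from math import factorial


def multinomial(exponents):
    """Multinomial coefficient n! / (a_1! a_2! ... a_r!) with n = a_1 + ... + a_r."""
    result = factorial(sum(exponents))
    for a in exponents:
        result //= factorial(a)
    return result


def cube_minus_power_sum():
    """Nonzero coefficients of (x1+x2+x3)^3 - (x1^3+x2^3+x3^3), keyed by exponent vector."""
    coeffs = {}
    for exps in product(range(4), repeat=3):
        if sum(exps) != 3:
            continue
        c = multinomial(exps)
        if max(exps) == 3:
            c -= 1  # subtract the matching term x_i^3 of the power sum p_3
        if c != 0:
            coeffs[exps] = c
    return coeffs


coeffs = cube_minus_power_sum()
print(all(c % 3 == 0 for c in coeffs.values()))
```

Every surviving coefficient is either 3 (for monomials such as $x_1^2x_2$) or 6 (for $x_1x_2x_3$), so the whole difference reduces to zero in characteristic 3.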




9.16 Relations between e’s and h’s 


We have seen that the lists $e_1,\ldots,e_N$ and $p_1,\ldots,p_N$ are algebraically independent in
$\mathbb{R}[x_1,\ldots,x_N]$. The reader may wonder whether the list $h_1,\ldots,h_N$ is also
algebraically independent. This fact would follow (as it did for $e_1,\ldots,e_N$) if we knew that
$\{h_\mu : \mu \in \mathrm{Par}_N(k)'\}$ was a basis of $\Lambda_N^k$ for all $k \ge 0$. However,
the basis we found in Theorem 9.66 was $\{h_\mu : \mu \in \mathrm{Par}_N(k)\}$, which is indexed by
partitions of $k$ with at most $N$ parts, instead of partitions of $k$ with each part at most $N$.
The next result allows us to overcome this difficulty by providing equations relating
$e_1,\ldots,e_N$ to $h_1,\ldots,h_N$.


9.81. Theorem: Recursion involving $e_i$ and $h_j$. For all $m \ge 0$ and $N > 0$,

\[ \sum_{i=0}^{m} (-1)^i e_i(x_1,\ldots,x_N)\,h_{m-i}(x_1,\ldots,x_N) = \chi(m = 0)
   = \begin{cases} 1 & \text{if } m = 0; \\ 0 & \text{if } m > 0. \end{cases} \tag{9.8} \]


Proof. If $m = 0$, the identity becomes $1 = 1$, so let us assume $m > 0$. We can model the left
side of the identity using a collection $Z$ of signed weighted objects. A typical object in $Z$ is
a triple $z = (i,S,T)$, where $0 \le i \le m$, $S \in \mathrm{SSYT}_N((1^i))$, and
$T \in \mathrm{SSYT}_N((m-i))$. The weight of $(i,S,T)$ is $x^Sx^T$, and the sign of $(i,S,T)$ is
$(-1)^i$. For example, taking $N = 9$ and $m = 7$, a typical object in $Z$ is

\[ z = \left(3,\ \begin{array}{|c|}\hline 2\\ \hline 4\\ \hline 7\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|c|}\hline 3 & 3 & 4 & 6\\ \hline\end{array}\right). \]

The signed weight of this object is $(-1)^3(x_2x_4x_7)(x_3^2x_4x_6) = -x_2x_3^2x_4^2x_6x_7$.
Recalling that $e_i = s_{(1^i)}$ and $h_{m-i} = s_{(m-i)}$, we see that the left side of (9.8) is
precisely

\[ \sum_{z \in Z} \mathrm{sgn}(z)\,\mathrm{wt}(z). \]

To prove this expression is zero, we define a sign-reversing, weight-preserving involution
$I : Z \to Z$ with no fixed points. Given $z = (i,S,T) \in Z$, we compute $I(z)$ as follows. Let
$j = S(1,1)$ be the smallest entry in $S$, and let $k = T(1,1)$ be the leftmost entry in $T$. If
$i = 0$, then $S$ is empty and $j$ is undefined; if $i = m$, then $T$ is empty and $k$ is undefined.
Since $m > 0$, at least one of $j$ or $k$ is defined. If $j \le k$ or $k$ is not defined, move the
box containing $j$ from $S$ to $T$, so that this box is the new leftmost entry in $T$, and decrement
$i$ by 1. Otherwise, if $k < j$ or $j$ is not defined, move the box containing $k$ from $T$ to $S$,
so that this box is the new topmost box in $S$, and increment $i$ by 1. For example, if $z$ is the
object shown above, then

\[ I(z) = \left(2,\ \begin{array}{|c|}\hline 4\\ \hline 7\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|c|c|}\hline 2 & 3 & 3 & 4 & 6\\ \hline\end{array}\right). \]

As another example,

\[ I\left(0,\ \emptyset,\ \begin{array}{|c|c|c|c|c|c|c|}\hline 2 & 2 & 3 & 5 & 5 & 7 & 9\\ \hline\end{array}\right)
   = \left(1,\ \begin{array}{|c|}\hline 2\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|c|c|c|}\hline 2 & 3 & 5 & 5 & 7 & 9\\ \hline\end{array}\right). \]

From the definition of $I$, we can check that $I$ does map $Z$ into $Z$, that
$I \circ I = \mathrm{id}_Z$, that $I$ is weight-preserving and sign-reversing, and that $I$ has no
fixed points. □
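The involution $I$ is simple enough to implement directly. In the following Python sketch (our own illustration, assuming $m > 0$), a column $S$ is stored as a strictly increasing tuple, a row $T$ as a weakly increasing tuple, and the tie rule is the one used in the proof: the smallest entry $j$ of $S$ moves to $T$ when $j \le k$ or $T$ is empty, and otherwise the leftmost entry $k$ of $T$ moves to $S$. The function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def objects(m, N):
    """All triples (i, S, T): S a strictly increasing column of size i,
    T a weakly increasing row of size m - i, entries in 1..N."""
    for i in range(m + 1):
        for S in combinations(range(1, N + 1), i):
            for T in combinations_with_replacement(range(1, N + 1), m - i):
                yield (i, S, T)


def involution(z):
    """The map I from the proof; requires that S and T are not both empty."""
    i, S, T = z
    j = S[0] if S else None  # smallest entry of the column S
    k = T[0] if T else None  # leftmost entry of the row T
    if k is None or (j is not None and j <= k):
        return (i - 1, S[1:], (j,) + T)  # move j to the front of T
    return (i + 1, (k,) + S, T[1:])      # move k to the top of S


def sign(z):
    return (-1) ** z[0]


def weight(z):
    """The multiset of entries, which determines the monomial x^S x^T."""
    return tuple(sorted(z[1] + z[2]))


print(involution((3, (2, 4, 7), (3, 3, 4, 6))))
```

Running `involution` over `objects(m, N)` for small $m$ and $N$ is a quick way to confirm that the objects pair off with opposite signs and equal weights, which is exactly why the signed sum in (9.8) vanishes for $m > 0$.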


9.82. Theorem: Complete Homogeneous Basis. For all $k \ge 0$ and $N > 0$,

\[ \{h_\mu(x_1,\ldots,x_N) : \mu \in \mathrm{Par}_N(k)'\} \]

is a basis of the vector space $\Lambda_N^k$. The collection
$\{h_1^{i_1} \cdots h_N^{i_N} : i_1,\ldots,i_N \ge 0\}$ is a basis of the vector space $\Lambda_N$,
and $h_1,\ldots,h_N$ is an algebraically independent list. The evaluation map
$E : \mathbb{R}[y_1,\ldots,y_N] \to \Lambda_N$ sending each $y_j$ to $h_j$ is an algebra
isomorphism. So, for every symmetric polynomial $f(x_1,\ldots,x_N)$, there exists a unique
polynomial $g(y_1,\ldots,y_N)$ such that $f = E(g) = g(h_1,\ldots,h_N)$.


Proof. It suffices to prove the statement about the basis of $\Lambda_N^k$, from which the other
assertions follow. Since $|\mathrm{Par}_N(k)'| = |\mathrm{Par}_N(k)|$, we have

\[ |\{h_\mu(x_1,\ldots,x_N) : \mu \in \mathrm{Par}_N(k)'\}| \le
   |\{h_\mu(x_1,\ldots,x_N) : \mu \in \mathrm{Par}_N(k)\}|, \]

where the right side is the dimension of $\Lambda_N^k$, by Theorem 9.66. (Strict inequality could
occur if $h_\mu = h_\nu$ for some $\mu \ne \nu$ in $\mathrm{Par}_N(k)'$.) By a theorem of linear
algebra, it is enough to prove that $\{h_\mu : \mu \in \mathrm{Par}_N(k)'\}$ spans the entire
subspace $\Lambda_N^k$. For each $k \ge 0$, let $W_N^k$ be the vector subspace of $\Lambda_N^k$
spanned by $h_\mu$ with $\mu \in \mathrm{Par}_N(k)'$. We must prove $W_N^k = \Lambda_N^k$ for all
$k$. It suffices to show that $e_1^{i_1} \cdots e_N^{i_N} \in W_N^k$ for all $i_1,\ldots,i_N \ge 0$
with $i_1 + 2i_2 + \cdots + Ni_N = k$, since these monomials in the elementary symmetric
polynomials are known to be a basis of $\Lambda_N^k$. Now, it is routine to check that
$f \in W_N^k$ and $g \in W_N^m$ imply $fg \in W_N^{k+m}$. (This holds when $f$ and $g$ are products
of $h_1,\ldots,h_N$, and the general case follows by linearity and the Distributive Law.) Using
this fact, we can further reduce to proving that $e_j(x_1,\ldots,x_N) \in W_N^j$ for
$1 \le j \le N$.

We prove this by induction on $j$. The result is true for $j = 1$, since
$e_1 = \sum_{k=1}^{N} x_k = h_1 \in W_N^1$. Assume $1 < j \le N$ and the result is known to hold
for all smaller values of $j$. Taking $m = j$ in the recursion (9.8), we have

\[ e_j = e_{j-1}h_1 - e_{j-2}h_2 + e_{j-3}h_3 - \cdots \pm e_1h_{j-1} \mp h_j. \]

Since $e_{j-s} \in W_N^{j-s}$ (by induction) and $h_s \in W_N^s$ (by definition) for
$1 \le s \le j$, each term on the right side lies in $W_N^j$. Since $W_N^j$ is a subspace, it
follows that $e_j \in W_N^j$, completing the induction. □


9.17 Generating Functions for e’s and h’s 
Another approach to the identity (9.8) involves generating functions. 
9.83. Definition: $E_N(t)$ and $H_N(t)$. For each $N \ge 1$, define

\[ E_N(t) = \prod_{i=1}^{N} (1 + x_it), \qquad H_N(t) = \prod_{i=1}^{N} \frac{1}{1 - x_it}. \]


9.84. Remark. $E_N$ is an element of the polynomial ring $K[t]$, where
$K = \mathbb{R}(x_1,\ldots,x_N)$ is the field of rational functions (formal ratios of polynomials)
in $x_1,\ldots,x_N$. Similarly, $H_N$ is an element of the formal power series ring $K[[t]]$.


9.85. Theorem: Expansion of $E_N(t)$. For all $N \ge 1$,

\[ E_N(t) = \sum_{k=0}^{N} e_k(x_1,\ldots,x_N)\,t^k. \]


Proof. We can use the Generalized Distributive Law (see Exercises 2-16 and 4-68) to expand
the product in the definition of $E_N(t)$. For each $i$ between 1 and $N$, we choose either 1 or
$x_it$ from the factor $1 + x_it$, and then multiply all these choices together. We can encode
each choice sequence by a subset $S$ of $\{1,\ldots,N\}$, where $i \in S$ iff $x_it$ is chosen.
Therefore,

\[ E_N(t) = \prod_{i=1}^{N} (1 + x_it)
   = \sum_{S \subseteq \{1,2,\ldots,N\}} \prod_{i \in S} (x_it) \prod_{i \notin S} 1. \]

To get terms involving $t^k$, we must restrict the sum to subsets $S$ of size $k$. Such subsets can
be identified with increasing sequences $1 \le i_1 < i_2 < \cdots < i_k \le N$. Therefore, for all
$k \ge 0$, the coefficient of $t^k$ in $E_N(t)$ is

\[ \sum_{1 \le i_1 < i_2 < \cdots < i_k \le N} x_{i_1}x_{i_2} \cdots x_{i_k} = e_k(x_1,\ldots,x_N). \]

Note that this coefficient is zero when $k > N$. □


Here is a famous algebraic application of elementary symmetric functions, which explains 
the relation between the roots and the coefficients of a polynomial in one variable. 


9.86. Theorem: Roots and Coefficients of a Polynomial. Suppose a polynomial
$p(X) = X^N + a_1X^{N-1} + \cdots + a_iX^{N-i} + \cdots + a_{N-1}X + a_N \in \mathbb{R}[X]$ can be
factored as $p(X) = (X - r_1)(X - r_2) \cdots (X - r_N)$ for some $r_1,\ldots,r_N \in \mathbb{R}$.
For all $i$ in the range $1 \le i \le N$,

\[ a_i = (-1)^i e_i(r_1,\ldots,r_N). \]

Proof. This can be proved by expanding $\prod_{i=1}^{N} (X - r_i)$ using the Generalized
Distributive Law, as in the preceding proof. Alternatively, we can deduce the result from
Theorem 9.85 as follows. Replacing $t$ by $1/X$ and $x_i$ by $-r_i$ in $E_N(t)$ gives
$\prod_{i=1}^{N} (1 - r_i/X) = X^{-N}p(X)$. Using Theorem 9.85, we conclude that

\[ p(X) = X^N \sum_{k=0}^{N} e_k(-r_1,\ldots,-r_N)X^{-k}
   = \sum_{k=0}^{N} (-1)^k e_k(r_1,\ldots,r_N)X^{N-k}. \]

Taking the coefficient of $X^{N-i}$ gives the result. □
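The relation between roots and coefficients can be checked numerically. The Python sketch below (an illustration of ours, with hypothetical function names) multiplies out $\prod (X - r_i)$ one factor at a time and compares each coefficient $a_i$ with $(-1)^i e_i(r_1,\ldots,r_N)$, where $e_i$ is computed straight from its definition as a sum over $i$-element subsets.

```python
from itertools import combinations
from math import prod


def elementary(k, values):
    """e_k(values): the sum of products over all k-element subsets (e_0 = 1)."""
    return sum(prod(subset) for subset in combinations(values, k))


def expand_monic(roots):
    """Coefficients [1, a_1, ..., a_N] of prod (X - r), highest degree first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (X - r):
        # new coefficient i equals c_i - r * c_{i-1}.
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


roots = [1, 2, 3]  # hypothetical sample roots
coeffs = expand_monic(roots)
print(coeffs)
print([(-1) ** i * elementary(i, roots) for i in range(len(roots) + 1)])
```

The two printed lists agree entry by entry, as the theorem predicts.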


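9.87. Theorem: Expansion of $H_N(t)$. For all $N \ge 1$,

\[ H_N(t) = \sum_{k=0}^{\infty} h_k(x_1,\ldots,x_N)\,t^k. \]


Proof. Using a formal version of the Geometric Series Formula (see §11.3 for a rigorous
development), we have

\[ H_N(t) = \prod_{i=1}^{N} \frac{1}{1 - x_it} = \prod_{i=1}^{N} \sum_{j_i=0}^{\infty} (x_it)^{j_i}. \]

Next, using a generalization of the distributive law to finite products of formal power series
(see Theorem 5.35), we get

\[ H_N(t) = \sum_{(j_1,\ldots,j_N) \in \mathbb{Z}_{\ge 0}^N} \prod_{i=1}^{N} (x_it)^{j_i}
   = \sum_{(j_1,\ldots,j_N) \in \mathbb{Z}_{\ge 0}^N} t^{j_1+\cdots+j_N} x_1^{j_1} \cdots x_N^{j_N}. \]

The coefficient of $t^k$ consists of the sum of all possible monomials in $x_1,\ldots,x_N$ of
degree $k$, which is precisely $h_k(x_1,\ldots,x_N)$. □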


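Now we can give an algebraic proof of Theorem 9.81. For each $N > 0$,

\[ H_N(t)E_N(-t) = \prod_{i=1}^{N} \frac{1}{1 - x_it} \prod_{i=1}^{N} (1 - x_it) = 1. \tag{9.9} \]

By Theorems 9.85 and 9.87, the coefficient of $t^m$ on the left side is
$\sum_{i=0}^{m} (-1)^i e_i h_{m-i}$. Equating the coefficients of $t^m$ on both sides gives (9.8).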
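The generating-function argument can be replayed with truncated power series. In the Python sketch below (our own illustration; the function names and the sample values of the $x_i$ are hypothetical), a series in $t$ is a list of coefficients, $E_N(\pm t)$ is built by multiplying the factors $1 \pm x_it$, and $H_N(t)$ is built by multiplying truncated geometric series.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def e(k, xs):
    """Elementary symmetric polynomial e_k evaluated at xs."""
    return sum(prod(c) for c in combinations(xs, k))


def h(k, xs):
    """Complete homogeneous symmetric polynomial h_k evaluated at xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, k))


def series_mul(a, b, order):
    """Product of two coefficient lists, truncated after the t^order term."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        for j, bj in enumerate(b[: order + 1 - i]):
            out[i + j] += ai * bj
    return out


def E_series(xs, order, sign=1):
    """Coefficients of prod (1 + sign * x_i * t) up to t^order."""
    coeffs = [1] + [0] * order
    for x in xs:
        coeffs = series_mul(coeffs, [1, sign * x], order)
    return coeffs


def H_series(xs, order):
    """Coefficients of prod 1 / (1 - x_i t) up to t^order, via geometric series."""
    coeffs = [1] + [0] * order
    for x in xs:
        coeffs = series_mul(coeffs, [x ** j for j in range(order + 1)], order)
    return coeffs


xs, order = [2, 3, 5], 6
print(series_mul(H_series(xs, order), E_series(xs, order, sign=-1), order))
```

The product has constant term 1 and every other coefficient up to the truncation order equal to 0, which is identity (9.8) read off coefficient by coefficient at this sample point.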


9.18 Relations between p’s, e’s, and h’s 


In this section, we study recursions similar to (9.8) that relate the complete and elementary
symmetric polynomials to the power-sum symmetric polynomials. These recursions can be
used to deduce the algebraic independence of $p_1,\ldots,p_N$ from the algebraic independence
of $h_1,\ldots,h_N$ (or $e_1,\ldots,e_N$) by adapting the proof of Theorem 9.82 to the new
recursions.


9.88. Theorem: Recursion involving $h_i$ and $p_j$. For all $n, N \ge 1$, the following identity
is valid in $\Lambda_N$:

\[ h_0p_n + h_1p_{n-1} + h_2p_{n-2} + \cdots + h_{n-1}p_1 = nh_n. \tag{9.10} \]

Proof. Let us interpret each side of the equation as the generating function for a collection of
weighted objects. For the left side, let $X$ be the set of all triples $(k,T,U)$, where:
$0 \le k < n$; $T \in \mathrm{SSYT}_N((k))$; and $U$ consists of a row of $n-k$ boxes all filled
with the same integer $i \in [N]$. The weight of such a triple is $x^Tx^U = x^Tx_i^{n-k}$. For
example, letting $n = 8$ and $N = 9$, here is a typical object in $X$ of weight
$x_1^2x_2x_3^3x_4^2$:

\[ z_0 = \left(5,\ \begin{array}{|c|c|c|c|c|}\hline 1 & 1 & 2 & 4 & 4\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|}\hline 3 & 3 & 3\\ \hline\end{array}\right). \]

For a fixed value of $k$, the generating function for the possible $T$'s is
$h_k(x_1,\ldots,x_N)$, and the generating function for the possible $U$'s is
$p_{n-k}(x_1,\ldots,x_N)$. By the Sum and Product Rules for Weighted Sets, the left side of (9.10)
is the generating function for $X$.

Now let $Y$ be the set of all pairs $(V,j)$, where $V \in \mathrm{SSYT}_N((n))$ and
$1 \le j \le n$. We can visualize an object in $Y$ as a semistandard tableau of shape $(n)$ in
which the $j$th cell has been marked. For example, here is a typical object in $Y$ of weight
$x_1^2x_3^5x_4$:

\[ y_0 = \begin{array}{|c|c|c|c|c|c|c|c|}\hline 1 & 1 & 3 & 3^* & 3 & 3 & 3 & 4\\ \hline\end{array}\,. \]

The generating function for the weighted set $Y$ is $nh_n(x_1,\ldots,x_N)$.

To prove (9.10), it suffices to define a weight-preserving bijection $f : X \to Y$. Given
$(k,T,U) \in X$, note that $U$ consists of a run of $n-k$ copies of some value $i$. To compute
$f(k,T,U)$, mark the first box in $U$ and splice the boxes of $U$ into $T$ in the appropriate
position to get a weakly increasing sequence. If $T$ already contains one or more $i$'s, the first
box of $U$ is inserted immediately after these $i$'s. For example, the triple $z_0$ above maps to

\[ f(z_0) = \begin{array}{|c|c|c|c|c|c|c|c|}\hline 1 & 1 & 2 & 3^* & 3 & 3 & 4 & 4\\ \hline\end{array}\,. \]

This insertion process is reversible, thanks to the marker. More precisely, define
$g : Y \to X$ as follows. Given $(V,j) \in Y$, let $i$ be the entry in the $j$th cell of $V$.
Starting at cell $j$ and scanning right, remove each cell equal to $i$ from $V$ to get a pair of
tableaux $T$ and $U$ as in the definition of $X$. Define $g(V,j) = (k,T,U)$, where $k$ is the
number of boxes in $T$. For example, the object $y_0$ above maps to

\[ g(y_0) = \left(4,\ \begin{array}{|c|c|c|c|}\hline 1 & 1 & 3 & 4\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|c|}\hline 3 & 3 & 3 & 3\\ \hline\end{array}\right). \]

One may check that $f$ and $g$ are weight-preserving functions that are two-sided inverses of
each other. □
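The two maps in this proof translate almost word for word into code. In the Python sketch below (our own illustration), rows are stored as tuples, the marked cell is recorded by its 1-based position, and `all_triples` is a hypothetical helper that enumerates the set $X$ for small parameters.

```python
from itertools import combinations_with_replacement


def f(k, T, U):
    """Splice the run U (n - k copies of one value i) into the row T.
    Returns (V, j), where j is the 1-based position of the marked cell."""
    i = U[0]
    pos = sum(1 for t in T if t <= i)  # entries of T that stay left of the run
    V = T[:pos] + U + T[pos:]
    return V, pos + 1


def g(V, j):
    """Inverse map: remove the run of equal entries that starts at marked cell j."""
    i = V[j - 1]
    end = j - 1
    while end < len(V) and V[end] == i:
        end += 1
    T = V[: j - 1] + V[end:]
    U = V[j - 1 : end]
    return len(T), T, U


def all_triples(n, N):
    """The set X: triples (k, T, U) with 0 <= k < n and entries in 1..N."""
    return [
        (k, T, (i,) * (n - k))
        for k in range(n)
        for T in combinations_with_replacement(range(1, N + 1), k)
        for i in range(1, N + 1)
    ]


print(f(5, (1, 1, 2, 4, 4), (3, 3, 3)))
print(g((1, 1, 3, 3, 3, 3, 3, 4), 4))
```

The two printed values reproduce the worked examples $f(z_0)$ and $g(y_0)$ from the proof; looping over `all_triples(n, N)` for small $n$ and $N$ confirms that `g` undoes `f`.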


9.89. Theorem: Recursion involving $e_i$ and $p_j$. For all $n, N \ge 1$, the following identity
is valid in $\Lambda_N$:

\[ e_0p_n - e_1p_{n-1} + e_2p_{n-2} - \cdots + (-1)^{n-1}e_{n-1}p_1 = (-1)^{n-1}ne_n. \tag{9.11} \]


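Proof. This time we interpret each side of the equation using signed weighted objects. For
the left side, let $X$ be the set of all triples $(k,T,U)$, where: $0 \le k < n$;
$T \in \mathrm{SSYT}_N((1^k))$; and $U$ consists of a row of $n-k$ boxes all filled with the same
integer $j \in [N]$. The weight of this triple is $x^Tx^U$, and the sign of this triple is
$(-1)^k$. For example, here is a typical object in $X$ whose signed weight is
$(-1)^4x_2x_4^4x_5x_7$:

\[ z_0 = \left(4,\ \begin{array}{|c|}\hline 2\\ \hline 4\\ \hline 5\\ \hline 7\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|}\hline 4 & 4 & 4\\ \hline\end{array}\right). \]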


Using the Sum and Product Rules for Weighted Sets, one sees that
$\sum_{z \in X} \mathrm{sgn}(z)\,\mathrm{wt}(z)$ is the left side of (9.11).

Now let $Y = \{(T,j) : T \in \mathrm{SSYT}_N((1^n)),\ 1 \le j \le n\}$. We can think of each
element of $Y$ as a strictly increasing sequence of $n$ elements of $\{1,2,\ldots,N\}$ in which the
$j$th element has been marked. The generating function for the weighted set $Y$ is
$ne_n(x_1,\ldots,x_N)$.

Let us define a weight-preserving, sign-reversing involution $I : X \to X$. Fix
$(k,T,U) \in X$. Since $k < n$, $U$ is not empty; let $j$ be the integer appearing in each box of
$U$. The map $I$ acts as follows. On one hand, if $k < n-1$ and $j$ does not appear in $T$, then
increase $k$ by 1, remove one copy of $j$ from $U$, and insert this number in the proper position
in $T$ to get a sorted sequence. On the other hand, if $j$ does appear in $T$, then decrease $k$ by
1, remove the unique copy of $j$ from $T$, and place another copy of $j$ in $U$. If neither of the
two preceding cases occurs, $(k,T,U)$ is a fixed point of $I$. For example,

\[ I(z_0) = \left(3,\ \begin{array}{|c|}\hline 2\\ \hline 5\\ \hline 7\\ \hline\end{array}\,,\
   \begin{array}{|c|c|c|c|}\hline 4 & 4 & 4 & 4\\ \hline\end{array}\right). \]

It can be checked that $I$ is a well-defined, weight-preserving, sign-reversing involution on
$X$.

Let $Z$ be the set of fixed points of $I$. We see from the description of $I$ that $Z$ consists of
all triples $(n-1,T,[j])$ where $j$ does not appear in $T$. All of these triples have sign
$(-1)^{n-1}$. The proof will be complete if we can find a weight-preserving bijection
$g : Z \to Y$. We define $g$ by inserting a marked copy of $j$ into its proper position in the
increasing sequence $T$. The inverse map takes an increasing sequence of size $n$ with one marked
element and removes the marked element. □


9.19 Power-Sum Expansions of $h_n$ and $e_n$


We can use the recursions in Theorems 9.88 and 9.89 to compute expansions for $h_n$ and $e_n$
in terms of the power-sum symmetric polynomials $p_\mu$.


9.90. Example. We know that $h_0 = 1$ and $h_1 = p_1$. Next, since $h_0p_2 + h_1p_1 = 2h_2$, we
find that $h_2 = (p_{(2)} + p_{(1,1)})/2$. For $n = 3$, we have

\[ h_0p_3 + h_1p_2 + h_2p_1 = 3h_3, \]

so that

\[ h_3 = \frac{1}{3}\left(p_3 + p_1p_2 + \left(\frac{p_2 + p_1^2}{2}\right)p_1\right)
   = (1/3)p_{(3)} + (1/2)p_{(2,1)} + (1/6)p_{(1,1,1)}. \]

For $n = 4$, we use the relation

\[ h_0p_4 + h_1p_3 + h_2p_2 + h_3p_1 = 4h_4 \]

to find, after some calculations,

\[ h_4 = (1/4)p_{(4)} + (1/3)p_{(3,1)} + (1/8)p_{(2,2)} + (1/4)p_{(2,1,1)} + (1/24)p_{(1,1,1,1)}. \]

We can eliminate the fractions in the formula for $h_n$ by multiplying both sides by $n!$. For
instance,

\[ 3!\,h_3 = 2p_{(3)} + 3p_{(2,1)} + 1p_{(1,1,1)}; \]
\[ 4!\,h_4 = 6p_{(4)} + 8p_{(3,1)} + 3p_{(2,2)} + 6p_{(2,1,1)} + 1p_{(1,1,1,1)}. \]

Similar formulas can be derived for $n!\,e_n$, but here some signs occur. For instance,
calculations with (9.11) lead to the identities

\[ 3!\,e_3 = 2p_{(3)} - 3p_{(2,1)} + 1p_{(1,1,1)}; \]
\[ 4!\,e_4 = -6p_{(4)} + 8p_{(3,1)} + 3p_{(2,2)} - 6p_{(2,1,1)} + 1p_{(1,1,1,1)}. \]
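The "some calculations" in this example are mechanical, so they can be delegated to a short program. The Python sketch below (our own illustration; the data layout is an assumption) stores an expansion in power sums as a dictionary from partitions to exact rational coefficients and unwinds recursion (9.10) for the $h_n$ or recursion (9.11) for the $e_n$.

```python
from fractions import Fraction
from math import factorial


def expand(n, sign):
    """Power-sum expansions of h_0..h_n (sign = +1) or e_0..e_n (sign = -1).

    Entry m of the returned list is a dict {partition: coefficient}, where a
    partition is a weakly decreasing tuple and the coefficient multiplies p_mu.
    """
    table = [{(): Fraction(1)}]
    for m in range(1, n + 1):
        total = {}
        for k in range(m):
            # Term sign^k * a_k * p_{m-k}: append the part m - k to each partition.
            for mu, c in table[k].items():
                nu = tuple(sorted(mu + (m - k,), reverse=True))
                total[nu] = total.get(nu, 0) + sign ** k * c
        scale = Fraction(sign ** (m - 1), m)  # solve the recursion for a_m
        table.append({mu: scale * c for mu, c in total.items()})
    return table


h_table = expand(4, +1)
e_table = expand(4, -1)
for mu, c in sorted(h_table[4].items(), reverse=True):
    print(mu, c * factorial(4))
```

Multiplying the coefficients of $h_4$ by $4!$ gives the integers in the formula for $4!\,h_4$; replacing `h_table` by `e_table` in the loop gives the signed coefficients of $4!\,e_4$.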


Notice that the coefficients in the power-sum expansion of $4!\,h_4$ match the entries in
Table 7.1. This suggests the following result.


9.91. Theorem: Power-Sum Expansion of $h_n$. For all positive integers $n$ and $N$, the
following identity is valid in $\Lambda_N$:

\[ n!\,h_n = \sum_{\mu \in \mathrm{Par}(n)} (n!/z_\mu)\,p_\mu. \tag{9.12} \]


Proof. Recall from Theorem 7.115 that $n!/z_\mu$ is the number of permutations $w \in S_n$ with
cycle type $\mu$. This leads to the following combinatorial interpretations for the two sides
of (9.12). The left side counts all pairs $(w,T)$, where $w = w_1w_2 \cdots w_n \in S_n$ is a
permutation written in one-line form and $T = (i_1 \le i_2 \le \cdots \le i_n)$ is an element of
$\mathrm{SSYT}_N((n))$. Let $X$ be the set of all such pairs, with $\mathrm{wt}(w,T) = x^T$. For
example, here is a typical element of $X$ when $n = 8$, written as a two-rowed array:

\[ z_0 = \left[\begin{array}{ccccccccc}
   w: & 4 & 2 & 5 & 8 & 3 & 7 & 1 & 6 \\
   T: & 1 & 1 & 1 & 2 & 2 & 2 & 2 & 3 \end{array}\right]. \]
The right side of (9.12) counts all triples $(\mu,\sigma,C)$, where $\mu \in \mathrm{Par}(n)$,
$\sigma \in S_n$ is a permutation with cycle type $\mu$, and
$C : \{1,2,\ldots,n\} \to \{1,2,\ldots,N\}$ is a coloring of the numbers $1,\ldots,n$ using $N$
available colors such that all elements in the same cycle of $\sigma$ are assigned the same color
(cf. §7.16). Let the weight of $(\mu,\sigma,C)$ be $\prod_{i=1}^{n} x_{C(i)}$, and let $Y$ be
the set of all such weighted triples. For example, a typical element of $Y$ is shown here:

\[ y_0 = \left((3,2,2,1),\ (1,6,3)(2,5)(4,7)(8),\
   \left[\begin{array}{cccccccc}
   1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
   3 & 2 & 3 & 3 & 2 & 3 & 3 & 3 \end{array}\right]\right). \]

To see why the factor $p_\mu(x_1,\ldots,x_N)$ arises, consider how we may choose the coloring
function $C$ once $\mu$ and $\sigma$ have been selected. We know $\sigma$ is a product of cycles of
lengths $\mu_1, \mu_2, \ldots, \mu_{\ell(\mu)}$. Choose the common color of the elements in the
first cycle in any of $N$ ways. Since $\mu_1$ elements all receive the same color, the generating
function for this choice is
$x_1^{\mu_1} + x_2^{\mu_1} + \cdots + x_N^{\mu_1} = p_{\mu_1}(x_1,\ldots,x_N)$. Next, choose the
common color of the elements in the second cycle, which gives a factor of $p_{\mu_2}$, and so on.
Multiplying the generating functions for these choices gives $p_\mu(x_1,\ldots,x_N)$.

To complete the proof, we define weight-preserving maps $f : Y \to X$ and $g : X \to Y$
that are inverses of each other. To understand the definition of $f$, recall that a given
$\sigma \in S_n$ can be written in cycle notation in several different ways, since the cycles can
be presented in any order, and elements within each cycle can be cyclically permuted. Given
$(\mu,\sigma,C)$, we specify one particular cycle notation for $\sigma$ that depends on $C$, as
follows. First, cycles colored with smaller colors are written before cycles colored with larger
colors. Second, elements within each cycle are cyclically shifted so that the first element in
each cycle is the smallest element appearing in that cycle. Third, if several cycles have the same
color, then these cycles are ordered so that their minimum elements decrease from left to right.
For example, starting with the object $y_0$ above, we obtain the following cycle notation for
$\sigma$: $(2,5)(8)(4,7)(1,6,3)$. Note that $(2,5)$ is written first because this cycle has color
2. The other cycles, which are all colored 3, are presented in the given order because
$8 > 4 > 1$. Finally, to compute $f(\mu,\sigma,C)$, we erase the parentheses from the chosen cycle
notation for $\sigma$ and write the color $C(i)$ directly beneath each $i$ in the resulting word.
For example,

\[ f(y_0) = \left[\begin{array}{ccccccccc}
   w: & 2 & 5 & 8 & 4 & 7 & 1 & 6 & 3 \\
   T: & 2 & 2 & 3 & 3 & 3 & 3 & 3 & 3 \end{array}\right]. \]

It can be checked that $f$ is well-defined, maps into $X$, and preserves weights.

Now consider how to define the inverse map $g : X \to Y$. Given $(w,T) \in X$ with
$w = w_1 \cdots w_n$ and $T = i_1 \le \cdots \le i_n$, the coloring map $C$ is defined by setting
$C(w_j) = i_j$ for all $j$. To recover $\sigma$ from $w$ and $T$, we need to add parentheses to
$w$ to recreate the cycle notation satisfying the rules above. For each color $i$ in turn, look at
the substring of $w$ consisting of the symbols located above the $i$'s in $T$. Scan this substring
from left to right, and begin a new cycle each time a number smaller than all preceding numbers in
this substring is encountered. (The numbers that begin new cycles are called left-to-right
minima relative to color $i$.) This procedure defines $\sigma$, and finally we set
$\mu = \mathrm{type}(\sigma)$. For example, for the object $z_0$ above,

\[ g(z_0) = \left((2,2,1,1,1,1),\ (4)(2,5)(8)(3,7)(1)(6),\
   \left[\begin{array}{cccccccc}
   1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
   2 & 1 & 2 & 1 & 1 & 3 & 2 & 2 \end{array}\right]\right). \]

We find that $g(f(y_0)) = y_0$ and $f(g(z_0)) = z_0$. The reader may similarly verify that
$g \circ f = \mathrm{id}_Y$ and $f \circ g = \mathrm{id}_X$, so the proof is complete. □
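The maps $f$ and $g$ from this proof can be sketched in a few lines of Python. In this illustration of ours, a permutation $\sigma$ and a coloring $C$ are both dictionaries keyed by $1,\ldots,n$; the partition $\mu$ is omitted from the data because it is determined by $\sigma$. The function names are hypothetical.

```python
from itertools import combinations_with_replacement, permutations


def cycles_of(sigma):
    """Cycles of sigma (a dict i -> sigma(i)), each starting at its minimum."""
    seen, result = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = sigma[x]
        result.append(cycle)
    return result


def f(sigma, color):
    """(sigma, C) -> (w, T): cycles sorted by color ascending, then minimum descending."""
    ordered = sorted(cycles_of(sigma), key=lambda c: (color[c[0]], -c[0]))
    w = tuple(x for cycle in ordered for x in cycle)
    T = tuple(color[x] for x in w)
    return w, T


def g(w, T):
    """(w, T) -> (sigma, C): cut each color block of w at its left-to-right minima."""
    color = dict(zip(w, T))
    cycles, block_min = [], None
    for pos, x in enumerate(w):
        new_block = pos == 0 or T[pos] != T[pos - 1]
        if new_block or x < block_min:
            cycles.append([x])  # x is a left-to-right minimum in its color block
            block_min = x
        else:
            cycles[-1].append(x)
    sigma = {}
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            sigma[a] = b
    return sigma, color


sigma0 = {1: 6, 6: 3, 3: 1, 2: 5, 5: 2, 4: 7, 7: 4, 8: 8}
color0 = {1: 3, 2: 2, 3: 3, 4: 3, 5: 2, 6: 3, 7: 3, 8: 3}
print(f(sigma0, color0))
```

With `sigma0` and `color0` encoding the object $y_0$ (cycles $(1,6,3)(2,5)(4,7)(8)$, with the cycle $(2,5)$ colored 2 and the rest colored 3), the printed pair is the two-rowed array $f(y_0)$, and applying `g` to it recovers the original data.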


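Before considering the analogous theorem for $e_n$, we introduce the following notation.

9.92. Definition: The Sign Factor $\epsilon_\mu$. For every partition $\mu \in \mathrm{Par}(n)$, let

\[ \epsilon_\mu = (-1)^{n - \ell(\mu)} = \prod_{i=1}^{\ell(\mu)} (-1)^{\mu_i - 1}. \]

We proved in Theorem 7.34 that $\epsilon_\mu = \mathrm{sgn}(\sigma)$ for all $\sigma \in S_n$ such
that $\mathrm{type}(\sigma) = \mu$.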


9.93. Theorem: Power-Sum Expansion of $e_n$. For all positive integers $n$ and $N$, the
following identity is valid in $\Lambda_N$:

\[ n!\,e_n = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu (n!/z_\mu)\,p_\mu. \tag{9.13} \]


Proof. We use the notation $X$, $Y$, $f$, $g$, $z_0$, and $y_0$ from the proof of Theorem 9.91. We
saw in that proof that $\sum_{\mu \in \mathrm{Par}(n)} (n!/z_\mu)p_\mu$ is the generating function
for the weighted set $Y$. To model the right side of (9.13), we need to get the sign factors
$\epsilon_\mu$ into this sum. We accomplish this by assigning signs to objects in $Y$ as follows.
Given $(\mu,\sigma,C) \in Y$, write $\sigma$ in cycle notation as described previously. Attach a
$+$ to the first (minimum) element of each cycle, and attach a $-$ to the remaining elements in
each cycle. The overall sign of $(\mu,\sigma,C)$ is the product of these signs, which is
$\prod_i (-1)^{\mu_i - 1} = \epsilon_\mu$. For example, the object $y_0$ considered previously is
now written

\[ y_0 = \left((3,2,2,1),\ (2^{+},5^{-})(8^{+})(4^{+},7^{-})(1^{+},6^{-},3^{-}),\
   \left[\begin{array}{cccccccc}
   1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
   3 & 2 & 3 & 3 & 2 & 3 & 3 & 3 \end{array}\right]\right); \]

the sign of this object is $(-1)^4 = +1$.

The next step is to transfer these signs to the objects in $X$ using the weight-preserving
bijection $f : Y \to X$. Given $(w,T) \in X$, find the left-to-right minima relative to each color
$i$ (as discussed in the definition of $g$ in the proof of Theorem 9.91). Attach a $+$ to these
numbers and a $-$ to all other numbers in $w$. For example, $f(y_0)$ is now written

\[ f(y_0) = \left[\begin{array}{ccccccccc}
   w: & 2^{+} & 5^{-} & 8^{+} & 4^{+} & 7^{-} & 1^{+} & 6^{-} & 3^{-} \\
   T: & 2 & 2 & 3 & 3 & 3 & 3 & 3 & 3 \end{array}\right]. \]

As another example,

\[ z_0 = \left[\begin{array}{ccccccccc}
   w: & 4^{+} & 2^{+} & 5^{-} & 8^{+} & 3^{+} & 7^{-} & 1^{+} & 6^{+} \\
   T: & 1 & 1 & 1 & 2 & 2 & 2 & 2 & 3 \end{array}\right]. \]

The bijections $f$ and $g$ now preserve both signs and weights, by the way we defined signs
of objects in $X$. It follows that
$\sum_{z \in X} \mathrm{sgn}(z)\,\mathrm{wt}(z) = \sum_{y \in Y} \mathrm{sgn}(y)\,\mathrm{wt}(y)$,
and the sum over $Y$ is the right side of (9.13).

Now we define a sign-reversing, weight-preserving involution $I : X \to X$. Fix $(w,T) \in X$.
If all the entries of $T$ are distinct, then $(w,T)$ is a fixed point of $I$. We observe at once
that such a fixed point is necessarily positive, and the generating function for such objects is
$n!\,e_n(x_1,\ldots,x_N)$. On the other hand, suppose some color $i$ appears more than one time in
$T$. Choose the smallest color $i$ with this property, and let $w_k, w_{k+1}$ be the first two
symbols in the substring of $w$ located above this color. Define $I(w,T)$ by switching $w_k$ and
$w_{k+1}$; one checks that this is a weight-preserving involution. Furthermore, it can be verified
that switching these two symbols changes the number of left-to-right minima (relative to color
$i$) by exactly 1. For example,

\[ I(f(y_0)) = \left[\begin{array}{ccccccccc}
   w: & 5^{+} & 2^{+} & 8^{+} & 4^{+} & 7^{-} & 1^{+} & 6^{-} & 3^{-} \\
   T: & 2 & 2 & 3 & 3 & 3 & 3 & 3 & 3 \end{array}\right]. \]

As another example,

\[ I(z_0) = \left[\begin{array}{ccccccccc}
   w: & 2^{+} & 4^{-} & 5^{-} & 8^{+} & 3^{+} & 7^{-} & 1^{+} & 6^{+} \\
   T: & 1 & 1 & 1 & 2 & 2 & 2 & 2 & 3 \end{array}\right]. \]

In general, note that $w_k$ is always labeled by $+$; $w_{k+1}$ is labeled $+$ iff
$w_k > w_{k+1}$; and the signs attached to numbers following $w_{k+1}$ do not depend on the order
of the two symbols $w_k, w_{k+1}$. We have now shown that $I$ is sign-reversing, so the proof is
complete. □
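Identities (9.12) and (9.13) can be tested at a numerical point. The Python sketch below is our own illustration; it assumes the standard formula $z_\mu = \prod_i i^{m_i}\,m_i!$, where $m_i$ is the number of parts of $\mu$ equal to $i$ (the text defines $z_\mu$ in an earlier chapter), and all function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import factorial, prod


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def z(mu):
    """Assumed definition: product over part sizes i of i^(m_i) * m_i!."""
    return prod(part ** mult * factorial(mult) for part, mult in Counter(mu).items())


def epsilon(mu):
    """Sign factor (-1)^(n - number of parts)."""
    return (-1) ** (sum(mu) - len(mu))


def p(mu, xs):
    return prod(sum(x ** part for x in xs) for part in mu)


def h(n, xs):
    return sum(prod(c) for c in combinations_with_replacement(xs, n))


def e(n, xs):
    return sum(prod(c) for c in combinations(xs, n))


def h_from_power_sums(n, xs):
    return sum(Fraction(p(mu, xs), z(mu)) for mu in partitions(n))


def e_from_power_sums(n, xs):
    return sum(Fraction(epsilon(mu) * p(mu, xs), z(mu)) for mu in partitions(n))


xs = [1, 2, 3]  # hypothetical sample point, N = 3
print(all(h_from_power_sums(n, xs) == h(n, xs) for n in range(1, 6)))
print(all(e_from_power_sums(n, xs) == e(n, xs) for n in range(1, 6)))
```

Note that the check also covers $n > N$, where $e_n(x_1,\ldots,x_N) = 0$ and the signed power-sum expansion must cancel completely.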




9.20 The Involution $\omega$


Recall from §9.15 that the algebra $\Lambda_N$ is generated by the algebraically independent list
of power-sums $p_1,\ldots,p_N$. So the polynomial ring $\mathbb{R}[y_1,\ldots,y_N]$ is isomorphic
to $\Lambda_N$ via the evaluation map sending $y_j$ to $p_j(x_1,\ldots,x_N)$. Because of this
isomorphism, we can forget about the original variables $x_1,\ldots,x_N$ and regard the symbols
$p_1,\ldots,p_N$ as formal variables, writing $\Lambda_N = \mathbb{R}[p_1,\ldots,p_N]$. We can then
define evaluation homomorphisms with domain $\Lambda_N$ by sending each $p_j$ to an arbitrarily
chosen element $a_j$ in a given algebra over $\mathbb{R}$. It follows that for all partitions
$\mu$ and polynomials $f$, $p_\mu$ maps to $\prod_i a_{\mu_i}$, and $f(p_1,\ldots,p_N)$ maps to
$f(a_1,\ldots,a_N)$ under this homomorphism. The next definition uses this technique to define a
homomorphism from $\Lambda_N$ to itself.


9.94. Definition: The Map $\omega$. Let $\omega : \Lambda_N \to \Lambda_N$ be the unique algebra
homomorphism such that $\omega(p_j) = (-1)^{j-1}p_j$ for all $j$ between 1 and $N$.


9.95. Theorem: Properties of $\omega$. Let $\nu$ be a partition with $\nu_1 \le N$.
(a) $\omega \circ \omega = \mathrm{id}_{\Lambda_N}$, so $\omega$ is an isomorphism.
(b) $\omega(p_\nu) = \epsilon_\nu p_\nu$. (c) $\omega(h_\nu) = e_\nu$. (d) $\omega(e_\nu) = h_\nu$.


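Proof. (a) Observe that
$\omega \circ \omega(p_j) = \omega((-1)^{j-1}p_j) = (-1)^{j-1}(-1)^{j-1}p_j = p_j = \mathrm{id}(p_j)$
for $1 \le j \le N$. Since $\omega \circ \omega$ and id are algebra homomorphisms with domain
$\Lambda_N$ that have the same effect on every generator $p_j$, these two maps are equal. Thus,
$\omega$ has a two-sided inverse (namely, $\omega^{-1} = \omega$), so $\omega$ is an isomorphism.
(b) The homomorphism $\omega$ preserves multiplication, so

\[ \omega(p_\nu) = \omega\left(\prod_{i=1}^{\ell(\nu)} p_{\nu_i}\right)
   = \prod_{i=1}^{\ell(\nu)} \omega(p_{\nu_i})
   = \prod_{i=1}^{\ell(\nu)} (-1)^{\nu_i - 1}p_{\nu_i} = \epsilon_\nu p_\nu. \]

(c) For $1 \le n \le N$, we use Theorems 9.91 and 9.93 and part (b) to compute

\[ \omega(h_n) = \omega\left(\sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1}p_\mu\right)
   = \sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1}\omega(p_\mu)
   = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu z_\mu^{-1}p_\mu = e_n. \]

Since $\omega$ preserves multiplication, $\omega(h_\nu) = e_\nu$ follows.
(d) Part (d) follows by applying $\omega$ to both sides of (c), since
$\omega \circ \omega = \mathrm{id}$. □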


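The next theorem shows how $\omega$ acts on the Schur basis.

9.96. Theorem: Action of $\omega$ on $s_\lambda$. If $\lambda \in \mathrm{Par}(n)$ and $n \le N$,
then $\omega(s_\lambda) = s_{\lambda'}$ in $\Lambda_N$.


Proof. From Theorem 9.64, we know that for each $\mu \in \mathrm{Par}(n)$,

\[ h_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,\mu}\,s_\lambda(x_1,\ldots,x_N). \]

We can combine these equations into a single vector equation $H = K^{\mathrm{tr}}S$ using column
vectors $H = (h_\mu : \mu \in \mathrm{Par}(n))$ and $S = (s_\lambda : \lambda \in \mathrm{Par}(n))$.
Since $K^{\mathrm{tr}}$ (the transpose of the Kostka matrix) is unitriangular and hence invertible,
$S = (K^{\mathrm{tr}})^{-1}H$ is the unique vector $v$ satisfying $H = K^{\mathrm{tr}}v$.

From Theorem 9.69, we know that for each $\mu \in \mathrm{Par}(n)$,

\[ e_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,\mu}\,s_{\lambda'}(x_1,\ldots,x_N). \]

Applying the linear map $\omega$ to these equations produces the equations

\[ h_\mu = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,\mu}\,\omega(s_{\lambda'}). \]

This says that the vector $v = (\omega(s_{\lambda'}) : \lambda \in \mathrm{Par}(n))$ satisfies
$H = K^{\mathrm{tr}}v$. By the uniqueness property mentioned above, $v = S$. So, for all
$\lambda \in \mathrm{Par}(n)$, $s_\lambda = \omega(s_{\lambda'})$. Replacing $\lambda$ by
$\lambda'$ (or applying $\omega$ to both sides) gives the result. □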


What happens if we apply $\omega$ to the monomial basis of $\Lambda_N$? Since $\omega$ is an
invertible linear map, we get another basis of $\Lambda_N$ that turns out to be different from
those discussed so far. This basis is hard to describe directly, so it is given the following name.


9.97. Definition: Forgotten Basis for $\Lambda_N$. For each $\lambda \in \mathrm{Par}_N$, define
the forgotten symmetric polynomial $\mathrm{fgt}_\lambda = \omega(m_\lambda)$. The set
$\{\mathrm{fgt}_\lambda : \lambda \in \mathrm{Par}_N(k)\}$ is a basis of $\Lambda_N^k$.


9.21 Permutations and Tableaux


Iteration of the tableau insertion algorithm (§9.8) leads to some remarkable bijections that
map permutations, words, and matrices to certain pairs of tableaux. These bijections were
studied by Robinson, Schensted, and Knuth, and are therefore called RSK correspondences.
We begin in this section by showing how permutations can be encoded using pairs of standard
tableaux of the same shape.


9.98. Theorem: RSK Correspondence for Permutations. For all $n \ge 0$, there is a
bijection $\mathrm{RSK} : S_n \to \bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SYT}(\lambda) \times \mathrm{SYT}(\lambda)$.
Given $\mathrm{RSK}(w) = (P(w), Q(w))$, we call $P(w)$ the insertion tableau for $w$ and $Q(w)$ the
recording tableau for $w$.


Proof. Let w € Sy, have one-line form w = w w2---:w,. We construct a sequence of tableaux 
Po, Pi,..., Pn = P(w) and a sequence of tableaux Qo,Qi,---,Qn = Q(w) as follows. 
Initially, let Po and Qo be empty tableaux of shape (0). Suppose 1 <7 <n and P,_-1,Q;-1 
have already been constructed. Define P; = P;_1 < w; (the semistandard tableau obtained 
by insertion of w; into P;). Let (a, b) be the new cell in P; created by this insertion. Define Q; 
to be the filling obtained from Q;-1 by placing the value i in the new cell (a, b). Informally, 
we build P(w) by inserting wi,...,Wp (in this order) into an initially empty tableau. We 
build Q(w) by placing the numbers 1,2,...,n (in this order) in the new boxes created by 
each insertion. By construction, Q(w) has the same shape as P(w). Furthermore, since the 
new box at each stage is a corner box, one sees that Q(w) is a standard tableau. Since w is 
a permutation, the semistandard tableau P(w) contains the values 1,2,..., once each, so 
P(w) is also a standard tableau. We define RSK(w) = (P(w), Q(w)). 

To see that RSK is a bijection, we present an algorithm for computing the inverse map.
Let $(P,Q)$ be any pair of standard tableaux of the same shape $\lambda \in \mathrm{Par}(n)$. The idea is
to recover the one-line form $w_1 \cdots w_n$ in reverse by uninserting entries from $P$, using the
entries in $Q$ to decide which box to remove at each stage (cf. §9.9). To begin, note that $n$
occurs in some corner box $(a,b)$ of $Q$ (since $Q$ is standard). Apply reverse insertion to $P$
starting at $(a,b)$ to obtain the unique tableau $P_{n-1}$ and value $w_n$ such that $P_{n-1} \leftarrow w_n$ is
$P$ with new box $(a,b)$ (see Theorem 9.54). Let $Q_{n-1}$ be the tableau obtained by erasing $n$
from $Q$. Continue similarly: having computed $P_i$ and $Q_i$ such that $Q_i$ is a standard tableau
with $i$ cells, let $(a,b)$ be the corner box of $Q_i$ containing $i$. Apply reverse insertion to $P_i$
starting at $(a,b)$ to obtain $P_{i-1}$ and $w_i$. Then delete $i$ from $Q_i$ to obtain a standard tableau
$Q_{i-1}$ with $i-1$ cells. The resulting word $w = w_1 w_2 \cdots w_n$ is a permutation of $\{1, 2, \ldots, n\}$
(since $P$ contains each of these values exactly once), and our argument has shown that $w$
is the unique object satisfying $\mathrm{RSK}(w) = (P,Q)$. So RSK is a bijection. □
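
The forward and inverse algorithms just described can be sketched in Python. This is a minimal illustration, not the book's own code: the function names (`row_insert`, `rsk`, `reverse_insert`, `rsk_inverse`) and the representation of a tableau as a list of rows are assumptions made for this example.

```python
from bisect import bisect_left, bisect_right

def row_insert(P, x):
    """Insert x into tableau P (a list of rows, changed in place); return the new cell."""
    for r, row in enumerate(P):
        c = bisect_right(row, x)        # index of the leftmost entry strictly greater than x
        if c == len(row):               # no such entry: x goes at the end of this row
            row.append(x)
            return r, c
        row[c], x = x, row[c]           # bump the displaced entry into the next row
    P.append([x])                       # bumped past the last row: start a new row
    return len(P) - 1, 0

def rsk(word):
    """Return (P, Q): insertion tableau and recording tableau of the word."""
    P, Q = [], []
    for i, x in enumerate(word, start=1):
        r, _ = row_insert(P, x)
        if r == len(Q):
            Q.append([])
        Q[r].append(i)                  # the new cell always sits at the end of row r
    return P, Q

def reverse_insert(P, r):
    """Remove the corner cell ending row r of P and uninsert; return the ejected value."""
    x = P[r].pop()
    if not P[r]:
        P.pop()
    for k in range(r - 1, -1, -1):
        c = bisect_left(P[k], x) - 1    # rightmost entry strictly less than x
        P[k][c], x = x, P[k][c]
    return x

def rsk_inverse(P, Q):
    """Recover the word from (P, Q), where Q is a standard tableau."""
    P = [row[:] for row in P]
    Q = [row[:] for row in Q]
    word = []
    for i in range(sum(len(row) for row in Q), 0, -1):
        r = next(k for k, row in enumerate(Q) if row[-1] == i)
        Q[r].pop()
        if not Q[r]:
            Q.pop()
        word.append(reverse_insert(P, r))
    return word[::-1]

w = [3, 5, 1, 6, 4, 8, 7, 2]
P, Q = rsk(w)
print(P, Q)
print(rsk_inverse(P, Q))
```

Because `bisect_right` bumps the leftmost entry strictly greater than the inserted value, the same sketch also handles words with repeated letters.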


9.99. Example. Let $w = 35164872 \in S_8$. Figure 9.1 illustrates the computation of
$\mathrm{RSK}(w) = (P(w), Q(w))$. As an example of the inverse computation, let us determine
the permutation $v = \mathrm{RSK}^{-1}(Q(w), P(w))$ (note that we have switched the order of the
insertion and recording tableaux). Figure 9.2 displays the reverse insertions used to find
$v_n, v_{n-1}, \ldots, v_1$. We see that $v = 38152476$.


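Let us compare the two-line forms of $w$ and $v$:

\[ w = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 5 & 1 & 6 & 4 & 8 & 7 & 2 \end{pmatrix}, \qquad
   v = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 8 & 1 & 5 & 2 & 4 & 7 & 6 \end{pmatrix}. \]

We notice that $v$ and $w$ are inverse permutations.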


FIGURE 9.1
Computation of RSK(35164872). (Columns: Insertion Tableau, Recording Tableau; one row for each
of the insertions 3, 5, 1, 6, 4, 8, 7, 2.)


FIGURE 9.2
Mapping pairs of standard tableaux to permutations. (Columns: Insertion Tableau, Recording
Tableau, Output Value; one row for each uninsertion.)


9.22 Inversion Property of RSK


The phenomenon observed in the last example holds in general: if $w$ maps to $(P,Q)$ under
the RSK correspondence, then $w^{-1}$ maps to $(Q,P)$. To prove this fact, we must introduce
a new way of visualizing the construction of the insertion and recording tableaux for $w$.


9.100. Definition: Cartesian Graph of a Permutation. Given a permutation $w =
w_1 w_2 \cdots w_n \in S_n$, the graph of $w$ (in the $xy$-plane) is the set $G(w) = \{(i, w_i) : 1 \le i \le n\}$.


For example, the graph of w = 35164872 is drawn in Figure 9.3. 




FIGURE 9.3 
Cartesian graph of a permutation. 


To analyze the creation of the insertion and recording tableaux for RSK(w), we annotate 
the graph of w by drawing lines as described in the following definitions. 


9.101. Definition: Shadow Lines. Let $S = \{(x_1, y_1), \ldots, (x_k, y_k)\}$ be a finite set of points
in the first quadrant. The shadow of $S$ is

\[ \mathrm{Shd}(S) = \{(u, v) \in \mathbb{R}^2 : \text{for some } i,\ u \ge x_i \text{ and } v \ge y_i\}. \]

Informally, the shadow consists of all points northeast of some point in $S$. The first shadow
line $L_1(S)$ is the boundary of $\mathrm{Shd}(S)$. This boundary consists of an infinite vertical ray
(part of the line $x = a_1$, say), followed by zero or more alternating horizontal and vertical
line segments, followed by an infinite horizontal ray (part of the line $y = b_1$, say). Call $a_1$
and $b_1$ the x-coordinate and y-coordinate associated to this shadow line. Next, let $S_1$ be the
set of points in $S$ that lie on the first shadow line of $S$. The second shadow line $L_2(S)$ is
the boundary of $\mathrm{Shd}(S - S_1)$, which has associated coordinates $(a_2, b_2)$. Letting $S_2$ be the
points in $S$ that lie on the second shadow line, the third shadow line $L_3(S)$ is the boundary
of $\mathrm{Shd}(S - (S_1 \cup S_2))$. We continue to generate shadow lines in this way until all points of
$S$ lie on some shadow line. Finally, the first-order shadow diagram of $w \in S_n$ consists of all
shadow lines associated to the graph $G(w)$.


FIGURE 9.4
Shadow lines for a permutation graph.


9.102. Example. The first-order shadow diagram of $w = 35164872$ is drawn in Figure 9.4.
The x-coordinates associated to the shadow lines of $w$ are 1, 2, 4, 6. These x-coordinates
agree with the entries in the first row of the recording tableau $Q(w)$, which we computed
in Example 9.99. Similarly, the y-coordinates of the shadow lines are 1, 2, 6, 7, which are
precisely the entries in the first row of the insertion tableau $P(w)$. The next result explains
why this happens, and shows that the shadow diagram contains complete information about
the evolution of the first rows of $P(w)$ and $Q(w)$.
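
The coordinates quoted in this example can be recomputed with a short sketch. It assumes the point set has distinct x-coordinates and distinct y-coordinates (true for a permutation graph); under that assumption a point lies on the current shadow line exactly when no remaining point to its left is lower. The name `shadow_lines` is hypothetical.

```python
def shadow_lines(points):
    """Split a point set (distinct x's, distinct y's) into its successive shadow lines.

    Each shadow line is returned as the list of points of the set lying on it,
    ordered from left to right.
    """
    remaining = sorted(points)
    lines = []
    while remaining:
        on_line, rest, lowest = [], [], float("inf")
        for x, y in remaining:
            if y < lowest:              # nothing to the left is lower: point is on the boundary
                on_line.append((x, y))
                lowest = y
            else:                       # point lies strictly inside the shadow
                rest.append((x, y))
        lines.append(on_line)
        remaining = rest
    return lines

w = [3, 5, 1, 6, 4, 8, 7, 2]
graph = [(i, wi) for i, wi in enumerate(w, start=1)]
lines = shadow_lines(graph)
xs = [line[0][0] for line in lines]     # x-coordinate of each infinite vertical ray
ys = [line[-1][1] for line in lines]    # y-coordinate of each infinite horizontal ray
print(xs, ys)
```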


9.103. Theorem: Shadow Lines and RSK. Let $w \in S_n$ have first-order shadow lines
$L_1, \ldots, L_k$ with associated coordinates $(x_1, y_1), \ldots, (x_k, y_k)$. Let $P_0, P_1, \ldots, P_n = P(w)$ and
$Q_0, Q_1, \ldots, Q_n = Q(w)$ be the sequences of tableaux generated in the computation of
$\mathrm{RSK}(w)$. For $0 \le i \le n$, the y-coordinates of the intersections of the shadow lines with the
line $x = i + (1/2)$ are the entries in the first row of $P_i$, whereas the entries in the first row of
$Q_i$ consist of all $x_j \le i$. Whenever some shadow line $L_r$ has a vertical segment from $(i, a)$
down to $(i, b)$, then $b = w_i$ and the insertion $P_i = P_{i-1} \leftarrow w_i$ bumps the value $a$ out of the
$r$th cell in the first row of $P_{i-1}$.


Proof. We proceed by induction on $i \ge 0$. The theorem holds when $i = 0$, since $P_0$ and
$Q_0$ are empty, and no shadow lines intersect the line $x = 1/2$. Fix $i$ between 1 and $n$, and
assume the result holds for $i - 1$. Then the first row of $P_{i-1}$ is $a_1 < a_2 < \cdots < a_j$, which
are the y-coordinates where the shadow lines hit the line $x = i - 1/2$. Consider the point
$(i, w_i)$, which is the unique point in $G(w)$ on the line $x = i$. First consider the case $w_i > a_j$.
In this case, the first $j$ shadow lines all pass underneath $(i, w_i)$. It follows that $(i, w_i)$ is the
first point of $G(w)$ on shadow line $L_{j+1}(G(w))$, so $x_{j+1} = i$. When we insert $w_i$ into $P_{i-1}$,
$w_i$ goes at the end of the first row of $P_{i-1}$ (since it exceeds the last entry $a_j$), and we place
$i$ in the corresponding cell in the first row of $Q_i$. The statements in the theorem regarding
$P_i$ and $Q_i$ are true in this case. Now consider the case $w_i < a_j$. Suppose $a_r$ is the smallest
value in the first row of $P_{i-1}$ exceeding $w_i$. Then insertion of $w_i$ into $P_{i-1}$ bumps $a_r$ out
of the first row. On the other hand, the point $(i, w_i)$ lies between the points $(i, a_{r-1})$ and
$(i, a_r)$ in the shadow diagram (taking $a_0 = 0$). It follows from the way the shadow lines are
drawn that shadow line $L_r$ must drop from $(i, a_r)$ to $(i, w_i)$ when it reaches the line $x = i$.
The statements of the theorem therefore hold for $i$ in this case as well. □


FIGURE 9.5
Higher-order shadow diagrams. (Two panels, each with axes labeled 1 through 8.)


To analyze the rows of P(w) and Q(w) below the first row, we iterate the shadow 
diagram construction as follows. 


9.104. Definition: Iterated Shadow Diagrams. Let $L_1, \ldots, L_k$ be the shadow lines
associated to a given subset $S$ of $\mathbb{R}^2$. An inner corner is a point $(a,b)$ at the top of one
of the vertical segments of some shadow line. Let $S'$ be the set of inner corners associated
to $S$. The second-order shadow diagram of $S$ is the shadow diagram associated to $S'$. We
iterate this process to define all higher-order shadow diagrams of $S$.


For example, taking w = 35164872, Figure 9.5 displays the second-order and third-order 
shadow diagrams for G(w). 


9.105. Theorem: Higher-Order Shadows and RSK. For $w \in S_n$, let $L_1, \ldots, L_k$ be
the shadow lines in the $r$th-order shadow diagram for $G(w)$, with associated coordinates
$(x_1, y_1), \ldots, (x_k, y_k)$. Let $P_0, P_1, \ldots, P_n = P(w)$ and $Q_0, Q_1, \ldots, Q_n = Q(w)$ be the
sequences of tableaux generated in the computation of $\mathrm{RSK}(w)$. For $0 \le i \le n$, the
y-coordinates of the intersections of the shadow lines with the line $x = i + (1/2)$ are the
entries in the $r$th row of $P_i$, whereas the entries in the $r$th row of $Q_i$ consist of all $x_j \le i$.
Whenever some shadow line $L_c$ has a vertical segment from $(i,a)$ down to $(i,b)$, then $b$ is
the value bumped out of row $r-1$ by the insertion $P_i = P_{i-1} \leftarrow w_i$, and $b$ bumps the value
$a$ out of the $c$th cell in row $r$ of $P_{i-1}$. (Take $b = w_i$ when $r = 1$.)


Proof. We use induction on $r \ge 1$. The base case $r = 1$ was proved in Theorem 9.103.
Consider $r = 2$ next. The proof of Theorem 9.103 shows that the inner corners of the
first-order shadow diagram of $w$ are precisely those points $(i,b)$ such that $b$ is bumped out
of the first row of $P_{i-1}$ and inserted into the second row of $P_{i-1}$ when forming $P_i$. The
reasoning used in the previous proof can now be applied to this set of points. Whenever
a point $(i,b)$ lies above all second-order shadow lines approaching the line $x = i$ from the
left, $b$ gets inserted in a new cell at the end of the second row of $P_i$, and the corresponding
cell in $Q_i$ receives the label $i$. Otherwise, if $(i,b)$ lies between shadow lines $L_{c-1}$ and $L_c$ in
the second-order diagram, then $b$ bumps the value in the $c$th cell of the second row of $P_{i-1}$
into the third row, and shadow line $L_c$ moves down to level $b$ when it reaches $x = i$. The
statements in the theorem (for $r = 2$) follow exactly as before by induction on $i$. Iterating
this argument establishes the analogous results for each $r > 2$. □
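
Theorem 9.105 suggests a way to read off every row of $P(w)$ and $Q(w)$ from iterated shadow diagrams. The sketch below is an illustration under the assumption that the input is a permutation (so all point sets involved have distinct x- and y-coordinates); the names `shadow_lines`, `inner_corners`, and `tableaux_from_shadows` are hypothetical.

```python
def shadow_lines(points):
    """Successive shadow lines of a point set with distinct x's and distinct y's."""
    remaining, lines = sorted(points), []
    while remaining:
        on_line, rest, lowest = [], [], float("inf")
        for x, y in remaining:
            if y < lowest:
                on_line.append((x, y))
                lowest = y
            else:
                rest.append((x, y))
        lines.append(on_line)
        remaining = rest
    return lines

def inner_corners(lines):
    """Tops of the vertical segments: between consecutive points (x1, y1), (x2, y2)
    on one shadow line, the line runs east at height y1 and drops at x = x2."""
    return [(line[t + 1][0], line[t][1])
            for line in lines for t in range(len(line) - 1)]

def tableaux_from_shadows(w):
    """Build (P, Q) row by row from the r-th order shadow diagrams, r = 1, 2, ..."""
    points = [(i, wi) for i, wi in enumerate(w, start=1)]
    P, Q = [], []
    while points:
        lines = shadow_lines(points)
        Q.append([line[0][0] for line in lines])    # associated x-coordinates
        P.append([line[-1][1] for line in lines])   # associated y-coordinates
        points = inner_corners(lines)
    return P, Q

print(tableaux_from_shadows([3, 5, 1, 6, 4, 8, 7, 2]))
```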


9.106. Theorem: RSK and Inversion. For all $w \in S_n$, if $\mathrm{RSK}(w) = (P,Q)$, then
$\mathrm{RSK}(w^{-1}) = (Q,P)$.


Proof. Consider the picture consisting of $G(w)$ and its first-order shadow diagram. Suppose
the shadow lines have associated x-coordinates $(a_1, \ldots, a_k)$ and y-coordinates $(b_1, \ldots, b_k)$.
Let us reflect the picture through the line $y = x$ (which interchanges x-coordinates and
y-coordinates). This reflection changes $G(w)$ into $G(w^{-1})$, since $(x,y) \in G(w)$ iff $y = w(x)$
iff $x = w^{-1}(y)$ iff $(y,x) \in G(w^{-1})$. We see from the geometric definition that the shadow
lines for $w$ get reflected into the shadow lines for $w^{-1}$. It follows from Theorem 9.103 that
the first row of both $Q(w)$ and $P(w^{-1})$ is $a_1, \ldots, a_k$, whereas the first row of both $P(w)$ and
$Q(w^{-1})$ is $b_1, \ldots, b_k$. The inner corners for $w^{-1}$ are the reflections of the inner corners for
$w$. So, we can apply the same argument to the higher-order shadow diagrams of $w$ and $w^{-1}$.
It follows that each row of $P(w^{-1})$ matches the corresponding row of $Q(w)$, and similarly
for $Q(w^{-1})$ and $P(w)$. □
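
As an informal sanity check of Theorem 9.106 (not a substitute for the proof), one can test the symmetry on all permutations of a small size. The helper names `rsk`, `inverse`, and `symmetry_holds` are hypothetical, and `rsk` is a compact row-insertion sketch.

```python
from bisect import bisect_right
from itertools import permutations

def rsk(word):
    """Compact RSK by row insertion; returns (P, Q) as lists of rows."""
    P, Q = [], []
    for i, x in enumerate(word, start=1):
        for r in range(len(P) + 1):
            if r == len(P):                 # fell off the bottom: new row
                P.append([x]); Q.append([i]); break
            c = bisect_right(P[r], x)       # leftmost entry strictly greater than x
            if c == len(P[r]):              # append at the end of row r
                P[r].append(x); Q[r].append(i); break
            P[r][c], x = x, P[r][c]         # bump and continue in the next row
    return P, Q

def inverse(w):
    """Inverse of a permutation given in one-line form."""
    inv = [0] * len(w)
    for i, wi in enumerate(w, start=1):
        inv[wi - 1] = i
    return inv

def symmetry_holds(w):
    P, Q = rsk(w)
    return rsk(inverse(w)) == (Q, P)

print(inverse([3, 5, 1, 6, 4, 8, 7, 2]))
print(all(symmetry_holds(list(w)) for w in permutations(range(1, 6))))
```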




9.23 Words and Tableaux


The RSK algorithm in the previous section sends permutations to pairs of standard tableaux 
of the same shape. We now extend this algorithm to operate on words w. Now the output 
RSK(w) is a pair of tableaux of the same shape, where the insertion tableau P(w) is 
semistandard and the recording tableau Q(w) is standard. 


9.107. Theorem: RSK Correspondence for Words. Let $W = [N]^n$ be the set of
$n$-letter words using the alphabet $[N]$. There is a bijection

\[ \mathrm{RSK} : W \to \bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SSYT}_N(\lambda) \times \mathrm{SYT}(\lambda). \]

For all $i \in [N]$, $i$ occurs the same number of times in $w$ and in $P(w)$.


Proof. Given $w = w_1 w_2 \cdots w_n \in W$, we define sequences of tableaux $P_0, P_1, \ldots, P_n$ and
$Q_0, Q_1, \ldots, Q_n$ as follows. $P_0$ and $Q_0$ are the empty tableau. If $P_{i-1}$ and $Q_{i-1}$ have been
computed for some $i$ with $1 \le i \le n$, let $P_i = P_{i-1} \leftarrow w_i$. Suppose this insertion creates
a new box $(c,d)$; then we form $Q_i$ from $Q_{i-1}$ by placing the value $i$ in the box $(c,d)$. By
induction on $i$, we see that every $P_i$ is a semistandard tableau, every $Q_i$ is a standard
tableau, and $P_i$ and $Q_i$ have the same shape. We set $\mathrm{RSK}(w) = (P_n, Q_n)$. The letters in
$P_n$ (counting repetitions) are exactly the letters in $w$, so the last statement of the theorem
holds.

Next we describe the inverse algorithm. Given $(P,Q)$ with $P$ semistandard and $Q$ standard
of the same shape, we construct semistandard tableaux $P_n, P_{n-1}, \ldots, P_0$, standard
tableaux $Q_n, Q_{n-1}, \ldots, Q_0$, and letters $w_n, w_{n-1}, \ldots, w_1$ as follows. Initially, $P_n = P$ and
$Q_n = Q$. Suppose, for some $i$ with $1 \le i \le n$, that we have already constructed tableaux
$P_i$ and $Q_i$ such that these tableaux have the same shape and consist of $i$ boxes, $P_i$ is
semistandard, and $Q_i$ is standard. The value $i$ lies in a corner cell of $Q_i$; perform uninsertion
starting from the same cell in $P_i$ to get a smaller semistandard tableau $P_{i-1}$ and a letter
$w_i$. Let $Q_{i-1}$ be $Q_i$ with the $i$ erased. At the end, output the word $w_1 w_2 \cdots w_n$. Using
Theorem 9.54 and induction, it can be checked that $w = w_1 \cdots w_n$ is the unique word $w$
with $\mathrm{RSK}(w) = (P,Q)$. So the RSK algorithm is a bijection. □


9.108. Example. Let $w = 21132131$. We compute $\mathrm{RSK}(w)$ in Figure 9.6.


FIGURE 9.6
Computation of RSK(21132131). (Columns: Insertion Tableau, Recording Tableau; one row for each
of the insertions 2, 1, 1, 3, 2, 1, 3, 1.)


Next we investigate how the RSK algorithm is related to certain statistics on words and 
tableaux. 


9.109. Definition: Descents and Major Index for Standard Tableaux. Let $Q$ be
a standard tableau with $n$ cells. The descent set of $Q$, denoted $\mathrm{Des}(Q)$, is the set of all
$k < n$ such that $k+1$ appears in a lower row of $Q$ than $k$. The descent count of $Q$, denoted
$\mathrm{des}(Q)$, is $|\mathrm{Des}(Q)|$. The major index of $Q$, denoted $\mathrm{maj}(Q)$, is $\sum_{k \in \mathrm{Des}(Q)} k$. (Compare to
Definition 8.24, which gives the analogous definitions for words.)


9.110. Example. For the standard tableau Q = Q(w) shown at the bottom of Figure 9.6, 
we have Des(Q) = {1,4,5,7}, des(Q) = 4, and maj(Q) = 17. Here, w = 21132131. Note 
that Des(w) = {1,4,5, 7}, des(w) = 4, and maj(w) = 17. This is not a coincidence. 


9.111. Theorem: RSK Preserves Descents and Major Index. For every word $w \in [N]^n$
with recording tableau $Q = Q(w)$, we have $\mathrm{Des}(w) = \mathrm{Des}(Q)$, $\mathrm{des}(w) = \mathrm{des}(Q)$, and
$\mathrm{maj}(w) = \mathrm{maj}(Q)$.


Proof. It suffices to prove $\mathrm{Des}(w) = \mathrm{Des}(Q)$. Let $P_0, P_1, \ldots, P_n$ and $Q_0, Q_1, \ldots, Q_n = Q$
be the sequences of tableaux computed when we apply the RSK algorithm to $w$. For each
$k < n$, note that $k \in \mathrm{Des}(w)$ iff $w_k > w_{k+1}$, whereas $k \in \mathrm{Des}(Q)$ iff $k+1$ appears in a row
below $k$ in $Q$. So, for each $k < n$, we must prove $w_k > w_{k+1}$ iff $k+1$ is in a row lower
than $k$ in $Q$. For this, we use the Bumping Comparison Theorem 9.56. Consider the double
insertion $(P_{k-1} \leftarrow w_k) \leftarrow w_{k+1}$. Let the new box in $P_{k-1} \leftarrow w_k$ be $(i,j)$, and let the new
box in $(P_{k-1} \leftarrow w_k) \leftarrow w_{k+1}$ be $(r,s)$. By definition of the recording tableau, $Q(i,j) = k$
and $Q(r,s) = k+1$. Now, if $w_k > w_{k+1}$, part 2 of Theorem 9.56 says that $i < r$ (and $j \ge s$).
So $k+1$ appears in a lower row than $k$ in $Q$. If instead $w_k \le w_{k+1}$, part 1 of Theorem 9.56
says that $i \ge r$ (and $j < s$). So $k+1$ does not appear in a lower row than $k$ in $Q$. □
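
The statement of Theorem 9.111 can be checked on the word of Example 9.108 with a few lines of Python. This is a sketch with hypothetical helper names (`recording_tableau`, `des_word`, `des_tableau`); the tableau is stored as a list of rows.

```python
from bisect import bisect_right

def recording_tableau(word):
    """Return Q(word) computed by RSK row insertion."""
    P, Q = [], []
    for i, x in enumerate(word, start=1):
        for r in range(len(P) + 1):
            if r == len(P):
                P.append([x]); Q.append([i]); break
            c = bisect_right(P[r], x)       # leftmost entry strictly greater than x
            if c == len(P[r]):
                P[r].append(x); Q[r].append(i); break
            P[r][c], x = x, P[r][c]
    return Q

def des_word(w):
    """Descent set of a word: all k with w_k > w_{k+1} (1-based positions)."""
    return {k for k in range(1, len(w)) if w[k - 1] > w[k]}

def des_tableau(Q):
    """Descent set of a standard tableau: all k with k+1 in a lower row than k."""
    row_of = {v: r for r, row in enumerate(Q) for v in row}
    return {k for k in range(1, len(row_of)) if row_of[k + 1] > row_of[k]}

w = [2, 1, 1, 3, 2, 1, 3, 1]
Q = recording_tableau(w)
print(Q, des_word(w), des_tableau(Q), sum(des_tableau(Q)))
```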


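Define the weight of a value $i \in [N]$ to be $x_i$, and define the weight of a word $w =
w_1 w_2 \cdots w_n \in [N]^n$ to be $x^w = x_{w_1} x_{w_2} \cdots x_{w_n}$. Using the Product Rule for Weighted Sets,
we find that

\[ \sum_{w \in [N]^n} \mathrm{wt}(w) = (x_1 + \cdots + x_N)^n = p_{(1^n)}(x_1, \ldots, x_N) = h_{(1^n)}(x_1, \ldots, x_N). \tag{9.14} \]

Recall that $h_{(1^n)} = \sum_{\lambda \in \mathrm{Par}(n)} K_{\lambda,(1^n)} s_\lambda$, where $K_{\lambda,(1^n)} = |\mathrm{SYT}(\lambda)|$. It follows that

\[ p_{(1^n)} = (x_1 + \cdots + x_N)^n = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|\, s_\lambda(x_1, \ldots, x_N). \]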

Using the RSK algorithm and Theorem 9.111, we obtain the following t-analogue of this 
identity. 


9.112. Theorem: Schur Expansion of Words Weighted by Major Index. For all
positive integers $n$ and $N$,

\[ \sum_{w \in [N]^n} t^{\mathrm{maj}(w)} x^w = \sum_{\lambda \in \mathrm{Par}(n)} \Bigl( \sum_{Q \in \mathrm{SYT}(\lambda)} t^{\mathrm{maj}(Q)} \Bigr) s_\lambda(x_1, \ldots, x_N). \tag{9.15} \]


Proof. The left side of (9.15) is the generating function for the weighted set $[N]^n$, where
the weight of a word $w$ is $t^{\mathrm{maj}(w)} x^w$. On the other side, $\mathrm{SSYT}_N(\lambda)$ is a weighted set with
$\mathrm{wt}(P) = x^P$ for each semistandard tableau $P$. The generating function for this weighted set
is precisely the Schur polynomial $s_\lambda(x_1, \ldots, x_N)$. Next, define weights on the set $\mathrm{SYT}(\lambda)$
by taking $\mathrm{wt}(Q) = t^{\mathrm{maj}(Q)}$ for $Q \in \mathrm{SYT}(\lambda)$. By the Sum and Product Rules for Weighted
Sets, the generating function for the weighted set $Z = \bigcup_{\lambda \in \mathrm{Par}(n)} \mathrm{SSYT}_N(\lambda) \times \mathrm{SYT}(\lambda)$ is the
right side of (9.15). To complete the proof, note that the RSK map is a weight-preserving
bijection between $[N]^n$ and $Z$, because of Theorems 9.107 and 9.111. □


9.113. Remark. The RSK correspondence can also be used to find the length of the longest 
weakly increasing or strictly decreasing subsequence of a given word. For details, see §12.13. 




9.24 Matrices and Tableaux 


Performing the RSK map on a word produces a pair consisting of one semistandard tableau 
and one standard tableau. We now define an RSK operation on matrices that maps each 
matrix to a pair of semistandard tableaux of the same shape. The first step is to encode 
the matrix using an object called a biword. 


9.114. Definition: Biword of a Matrix. Let $A = [a_{ij}]$ be an $M \times N$ matrix with entries
in $\mathbb{Z}_{\ge 0}$. The biword of $A$ is a two-row array

\[ \mathrm{bw}(A) = \begin{bmatrix} i_1 & i_2 & \cdots & i_k \\ j_1 & j_2 & \cdots & j_k \end{bmatrix} \]

constructed as follows. Start with an empty array, and scan the rows of $A$ from top to
bottom, reading each row from left to right. Whenever a nonzero integer $a_{ij}$ is encountered
in the scan, write down $a_{ij}$ copies of the column $\begin{bmatrix} i \\ j \end{bmatrix}$ at the end of the current biword. The
top row of $\mathrm{bw}(A)$ is called the row word of $A$ and denoted $r(A)$. The bottom row of $\mathrm{bw}(A)$
is called the column word of $A$ and denoted $c(A)$.


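9.115. Example. Suppose $A$ is the matrix

\[ A = \begin{bmatrix} 2 & 0 & 1 & 0 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 0 \end{bmatrix}. \]

The biword of $A$ is

\[ \mathrm{bw}(A) = \begin{bmatrix} 1 & 1 & 1 & 2 & 2 & 2 & 2 & 3 \\ 1 & 1 & 3 & 2 & 4 & 4 & 4 & 3 \end{bmatrix}. \]
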
9.116. Theorem: Matrices and Biwords. Let $X$ be the set of all $M \times N$ matrices with
entries in $\mathbb{Z}_{\ge 0}$. Let $Y$ be the set of all biwords $w = \begin{bmatrix} i_1 & i_2 & \cdots & i_k \\ j_1 & j_2 & \cdots & j_k \end{bmatrix}$ satisfying the
following conditions: (a) $i_1 \le i_2 \le \cdots \le i_k$; (b) if $i_s = i_{s+1}$, then $j_s \le j_{s+1}$; (c) $1 \le i_s \le M$
for all $s$; (d) $1 \le j_s \le N$ for all $s$. The map $\mathrm{bw} : X \to Y$ is a bijection. For all $A \in X$,
$i$ appears $\sum_j a_{ij}$ times in $r(A)$, $j$ appears $\sum_i a_{ij}$ times in $c(A)$, and $\mathrm{bw}(A)$ has length
$k = \sum_{i,j} a_{ij}$.

Proof. To show that bw maps $X$ into $Y$, we must show that $\mathrm{bw}(A)$ satisfies conditions (a)
through (d). Condition (a) holds since we scan the rows of $A$ from top to bottom. Condition
(b) holds since each row is scanned from left to right. Condition (c) holds since $A$ has $M$
rows. Condition (d) holds since $A$ has $N$ columns. We can invert the map bw as follows.
Given a biword $w \in Y$, let $A$ be the $M \times N$ matrix such that, for all $i, j$ satisfying $1 \le i \le M$
and $1 \le j \le N$, $a_{ij}$ is the number of indices $s$ with $i_s = i$ and $j_s = j$. The last statements
in the theorem follow from the way we constructed $r(A)$ and $c(A)$. □
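
The map bw and its inverse are easy to sketch in code. The function names `biword` and `matrix_from_biword` are hypothetical, and the sample matrix below is an assumption chosen so that its row and column words are the words $r(A) = (1,1,1,2,2,2,2,3)$ and $c(A) = (1,1,3,2,4,4,4,3)$ quoted in Example 9.119.

```python
def biword(A):
    """Return (row word, column word) of a matrix with nonnegative integer entries."""
    top, bottom = [], []
    for i, row in enumerate(A, start=1):          # scan rows top to bottom
        for j, a in enumerate(row, start=1):      # scan each row left to right
            top.extend([i] * a)                   # a copies of the column (i over j)
            bottom.extend([j] * a)
    return top, bottom

def matrix_from_biword(top, bottom, M, N):
    """Inverse map: a_ij counts the columns (i over j) in the biword."""
    A = [[0] * N for _ in range(M)]
    for i, j in zip(top, bottom):
        A[i - 1][j - 1] += 1
    return A

A = [[2, 0, 1, 0],
     [0, 1, 0, 3],
     [0, 0, 1, 0]]
r, c = biword(A)
print(r)
print(c)
print(matrix_from_biword(r, c, 3, 4) == A)
```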


9.117. Theorem: RSK Correspondence for Biwords. Let $Y$ be the set of biwords
defined in Theorem 9.116. Let $Z = \bigcup_{\lambda \in \mathrm{Par}} \mathrm{SSYT}_N(\lambda) \times \mathrm{SSYT}_M(\lambda)$. There is a bijection
$\mathrm{RSK} : Y \to Z$. If $\begin{bmatrix} v \\ w \end{bmatrix} \in Y$ maps to $(P,Q) \in Z$, then $v$ and $Q$ contain the same number of
$i$'s for all $i$, and $w$ and $P$ contain the same number of $j$'s for all $j$.


Proof. Given a biword $\begin{bmatrix} v \\ w \end{bmatrix} \in Y$, write $v = i_1 \le i_2 \le \cdots \le i_k$ and $w = j_1, j_2, \ldots, j_k$,
where $i_s = i_{s+1}$ implies $j_s \le j_{s+1}$. As in the previous RSK maps, we build sequences of
insertion tableaux $P_0, P_1, \ldots, P_k$ and recording tableaux $Q_0, Q_1, \ldots, Q_k$. Initially, $P_0$ and
$Q_0$ are empty. Having constructed $P_s$ and $Q_s$, let $P_{s+1} = P_s \leftarrow j_{s+1}$. If the new box created
by this insertion is $(a,b)$, obtain $Q_{s+1}$ from $Q_s$ by setting $Q_{s+1}(a,b) = i_{s+1}$. The final
output is the pair $(P_k, Q_k)$.

By construction, $P_k$ is a semistandard tableau with entries consisting of the letters in
$w$, and the entries of $Q_k$ are the letters in $v$. But, is $Q = Q_k$ a semistandard tableau? To
see that it is, note that we obtain $Q$ by successively placing a weakly increasing sequence
of numbers $i_1 \le i_2 \le \cdots \le i_k$ into new corner boxes of an initially empty tableau. It
follows that the rows and columns of $Q$ weakly increase. To see that columns of $Q$ strictly
increase, consider what happens during the placement of a run of equal numbers into $Q$,
say $i = i_s = i_{s+1} = \cdots = i_t$. By definition of $Y$, we have $j_s \le j_{s+1} \le \cdots \le j_t$. When
we insert this weakly increasing sequence into the $P$-tableau, the resulting sequence of new
boxes forms a horizontal strip by Theorem 9.58. So, the corresponding boxes in $Q$ (which
consist of all the boxes labeled $i$ in $Q$) also form a horizontal strip. This means that there
are never two equal numbers in a given column of $Q$.

The inverse algorithm reconstructs the words $v$ and $w$ in reverse, starting with $i_k$ and
$j_k$. Given $(P,Q)$, look for the rightmost occurrence of the largest letter in $Q$, which must
reside in a corner box. Let $i_k$ be this letter. Erase this cell from $Q$, and perform reverse
insertion on $P$ starting at the same cell to recover $j_k$. Iterate this process on the resulting
smaller tableaux. We have $i_k \ge \cdots \ge i_1$ since we remove the largest letter in $Q$ at each
stage. When we remove a string of equal letters from $Q$, say $i = i_t = i_{t-1} = \cdots = i_s$, the
associated letters removed from $P$ must satisfy $j_t \ge j_{t-1} \ge \cdots \ge j_s$. This follows from the
Bumping Comparison Theorem 9.56. For instance, if $j_{t-1} > j_t$, then the new box created
at stage $t$ would be weakly left of the new box created at stage $t-1$, which contradicts the
requirement of choosing the rightmost $i$ in $Q$ when recovering $i_t$ and $j_t$. It follows that the
inverse algorithm does produce a biword in $Y$, as required. □
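
The forward map on biwords differs from the word case only in what gets recorded: the top letter $i_s$ instead of the step number. A minimal sketch, with hypothetical names `rsk_biword` and `is_semistandard`, run on the row and column words quoted in Example 9.119:

```python
from bisect import bisect_right

def rsk_biword(top, bottom):
    """Insert the bottom word into P, recording the matching top letters in Q."""
    P, Q = [], []
    for i, x in zip(top, bottom):
        for r in range(len(P) + 1):
            if r == len(P):
                P.append([x]); Q.append([i]); break
            c = bisect_right(P[r], x)       # leftmost entry strictly greater than x
            if c == len(P[r]):
                P[r].append(x); Q[r].append(i); break
            P[r][c], x = x, P[r][c]
    return P, Q

def is_semistandard(T):
    """Rows weakly increase and columns strictly increase."""
    rows_ok = all(a <= b for row in T for a, b in zip(row, row[1:]))
    cols_ok = all(T[r][c] < T[r + 1][c]
                  for r in range(len(T) - 1) for c in range(len(T[r + 1])))
    return rows_ok and cols_ok

top = [1, 1, 1, 2, 2, 2, 2, 3]
bottom = [1, 1, 3, 2, 4, 4, 4, 3]
P, Q = rsk_biword(top, bottom)
print(P, Q, is_semistandard(P), is_semistandard(Q))
```

The check `is_semistandard(Q)` illustrates the horizontal-strip argument in the proof: it would generally fail if condition (b) in the definition of $Y$ were dropped.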


Composing the preceding bijections between the sets X, Y, and Z gives the following 
result. 


9.118. Theorem: RSK Correspondence for Matrices. For every $M, N \ge 1$, there is
a bijection between the set of $M \times N$ matrices with entries in $\mathbb{Z}_{\ge 0}$ and the set

\[ \bigcup_{\lambda \in \mathrm{Par}} \mathrm{SSYT}_N(\lambda) \times \mathrm{SSYT}_M(\lambda), \]

which sends the matrix $A$ to $\mathrm{RSK}(\mathrm{bw}(A))$. If $[a_{ij}]$ maps to $(P,Q)$ under this bijection, then
the number of $j$'s in $P$ is $\sum_i a_{ij}$, and the number of $i$'s in $Q$ is $\sum_j a_{ij}$.


9.119. Example. Let us compute the pair of tableaux associated to the matrix A 
from Example 9.115. Looking at the biword of A, we must insert the sequence c(A) = 
(1,1,3,2,4,4, 4,3) into the P-tableau, recording the entries in r(A) = (1,1, 1,2, 2,2, 2,3) in 
the Q-tableau. This computation appears in Figure 9.7. 


9.120. Theorem: Cauchy Identity for Schur Polynomials. For all $M, N \ge 1$,

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j} = \sum_{\lambda \in \mathrm{Par}} s_\lambda(y_1, \ldots, y_N)\, s_\lambda(x_1, \ldots, x_M). \tag{9.16} \]


Proof. We interpret each side as the generating function for a certain set of weighted objects.
For the left side, consider $M \times N$ matrices with entries in $\mathbb{Z}_{\ge 0}$. Let the weight of a matrix
$A = [a_{ij}]$ be

\[ \mathrm{wt}(A) = \prod_{i=1}^{M} \prod_{j=1}^{N} (x_i y_j)^{a_{ij}}. \]

We can build such a matrix by choosing the entries $a_{ij} \in \mathbb{Z}_{\ge 0}$ one at a time. For fixed $i$ and
$j$, the generating function for the choice of $a_{ij}$ is

\[ 1 + x_i y_j + (x_i y_j)^2 + \cdots + (x_i y_j)^k + \cdots = \frac{1}{1 - x_i y_j}. \]


FIGURE 9.7
Applying the RSK map to a biword. (Columns: Insertion Tableau, Recording Tableau; the steps are
insert 1, record 1; insert 1, record 1; insert 3, record 1; insert 2, record 2; insert 4, record 2;
insert 4, record 2; insert 4, record 2; insert 3, record 3.)


By the Product Rule for Weighted Sets, we see that the left side of (9.16) is the generating
function for this set of matrices. On the other hand, the RSK bijection converts each matrix
$A$ in this set to a pair $(P,Q)$ of semistandard tableaux of the same shape. This bijection
is weight-preserving provided that we weight each occurrence of $j$ in $P$ by $y_j$ and each
occurrence of $i$ in $Q$ by $x_i$. With these weights, the generating function for $\mathrm{SSYT}_N(\lambda)$ is
$s_\lambda(y_1, \ldots, y_N)$, and the generating function for $\mathrm{SSYT}_M(\lambda)$ is $s_\lambda(x_1, \ldots, x_M)$. It now follows
from the Sum and Product Rules for Weighted Sets that the right side of (9.16) is the
generating function for the weighted set $\bigcup_{\lambda \in \mathrm{Par}} \mathrm{SSYT}_N(\lambda) \times \mathrm{SSYT}_M(\lambda)$. Since RSK is a
weight-preserving bijection, the proof is complete. □


9.25 Cauchy's Identities


In the last section, we found a formula expressing the product $\prod_{i,j} (1 - x_i y_j)^{-1}$ as a sum of
products of Schur polynomials. Next we derive other formulas for this product that involve
other kinds of symmetric polynomials. Throughout, we operate in the formal power series
ring $\mathbb{R}[[x_1, \ldots, x_M, y_1, \ldots, y_N]]$.


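9.121. Theorem: Cauchy's Identities. For all $M, N \ge 1$,

\[ \begin{aligned}
\prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j}
  &= \sum_{\lambda \in \mathrm{Par}_N} h_\lambda(x_1, \ldots, x_M)\, m_\lambda(y_1, \ldots, y_N) \\
  &= \sum_{\lambda \in \mathrm{Par}_M} m_\lambda(x_1, \ldots, x_M)\, h_\lambda(y_1, \ldots, y_N) \\
  &= \sum_{\lambda \in \mathrm{Par}} \frac{p_\lambda(x_1, \ldots, x_M)\, p_\lambda(y_1, \ldots, y_N)}{z_\lambda}.
\end{aligned} \]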
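

Proof. Recall from Theorem 9.87 the product expansion

\[ \prod_{i=1}^{M} \frac{1}{1 - x_i t} = \sum_{k=0}^{\infty} h_k(x_1, \ldots, x_M)\, t^k. \]

Replacing $t$ by $y_j$, where $j$ is a fixed index, we obtain

\[ \prod_{i=1}^{M} \frac{1}{1 - x_i y_j} = \sum_{k=0}^{\infty} h_k(x_1, \ldots, x_M)\, y_j^k. \]

Taking the product over $j$ gives

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j} = \prod_{j=1}^{N} \sum_{k_j=0}^{\infty} h_{k_j}(x_1, \ldots, x_M)\, y_j^{k_j}. \]

We can expand the product on the right side using the Generalized Distributive Law for
formal power series (cf. Exercise 2-16). We obtain

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j} = \sum_{k_1=0}^{\infty} \cdots \sum_{k_N=0}^{\infty} \prod_{j=1}^{N} h_{k_j}(x_1, \ldots, x_M)\, y_j^{k_j}. \]

Let us reorganize the sum on the right side by grouping together summands indexed by
sequences $(k_1, \ldots, k_N)$ that can be sorted to give the same partition $\lambda$. Since $h_{k_1} h_{k_2} \cdots h_{k_N} =
h_\lambda$ for all such sequences, the right side becomes

\[ \sum_{\lambda \in \mathrm{Par}_N} h_\lambda(x_1, \ldots, x_M) \sum_{\substack{(k_1, \ldots, k_N) \in \mathbb{Z}_{\ge 0}^N : \\ \mathrm{sort}(k_1, \ldots, k_N) = \lambda}} y_1^{k_1} y_2^{k_2} \cdots y_N^{k_N}. \]

The inner sum is precisely the definition of $m_\lambda(y_1, \ldots, y_N)$. So the first formula of the
theorem is proved. The second formula follows from the same reasoning, interchanging the
roles of the x-variables and the y-variables.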

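To derive the formula involving power sums, we again start with Theorem 9.87, which
can be written

\[ \prod_{k=1}^{MN} \frac{1}{1 - z_k t} = \sum_{n=0}^{\infty} h_n(z_1, \ldots, z_{MN})\, t^n. \]

Replace the $MN$ variables $z_k$ by the $MN$ quantities $x_i y_j$, where $1 \le i \le M$ and $1 \le j \le N$.
We obtain

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{n=0}^{\infty} h_n(x_1 y_1, x_1 y_2, \ldots, x_M y_N)\, t^n. \]

Now use Theorem 9.91 to rewrite the right side in terms of power sums:

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{n=0}^{\infty} t^n \sum_{\lambda \in \mathrm{Par}(n)} \frac{p_\lambda(x_1 y_1, x_1 y_2, \ldots, x_M y_N)}{z_\lambda}. \]

Observe next that, for all $k \ge 1$,

\[ \begin{aligned}
p_k(x_1 y_1, \ldots, x_M y_N) &= \sum_{i=1}^{M} \sum_{j=1}^{N} (x_i y_j)^k = \sum_{i=1}^{M} \sum_{j=1}^{N} x_i^k y_j^k \\
  &= \Bigl( \sum_{i=1}^{M} x_i^k \Bigr) \Bigl( \sum_{j=1}^{N} y_j^k \Bigr) = p_k(x_1, \ldots, x_M)\, p_k(y_1, y_2, \ldots, y_N).
\end{aligned} \]

Therefore, for any partition $\lambda$,

\[ p_\lambda(x_1 y_1, \ldots, x_M y_N) = p_\lambda(x_1, x_2, \ldots, x_M)\, p_\lambda(y_1, y_2, \ldots, y_N). \]

It follows that

\[ \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} = \sum_{n=0}^{\infty} t^n \sum_{\lambda \in \mathrm{Par}(n)} \frac{p_\lambda(x_1, x_2, \ldots, x_M)\, p_\lambda(y_1, y_2, \ldots, y_N)}{z_\lambda}. \tag{9.17} \]

Setting $t = 1$ gives the final formula of the theorem. □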
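
Identity (9.17) can be spot-checked numerically with exact rational arithmetic. The sketch below compares, for small $n$, the coefficient of $t^n$ on the left (computed as $h_n$ of the products $x_i y_j$) with the power-sum expression on the right. The helper names (`partitions`, `z`, `p`, `h`, `rhs`) and the sample values of the variables are assumptions made for the illustration; a numerical check at a few points is evidence, not a proof.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def partitions(n, largest=None):
    """Yield the integer partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def z(lam):
    """z_lambda = product over part sizes i of i^(m_i) * m_i!, m_i = multiplicity of i."""
    out = 1
    for part, mult in Counter(lam).items():
        out *= part ** mult * factorial(mult)
    return out

def p(lam, values):
    """Power-sum symmetric polynomial p_lambda evaluated at the given values."""
    out = Fraction(1)
    for k in lam:
        out *= sum(Fraction(v) ** k for v in values)
    return out

def h(n, values):
    """h_n(values), read off as the coefficient of t^n in prod 1/(1 - v t)."""
    coeff = [Fraction(1)] + [Fraction(0)] * n
    for v in values:
        for d in range(1, n + 1):           # multiply the truncated series by 1/(1 - v t)
            coeff[d] += v * coeff[d - 1]
    return coeff[n]

def rhs(n, xs, ys):
    return sum(p(lam, xs) * p(lam, ys) / z(lam) for lam in partitions(n))

xs, ys = [1, 2], [3, 5, 7]
products = [x * y for x in xs for y in ys]
print([(h(n, products), rhs(n, xs, ys)) for n in range(4)])
```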


9.26 Dual Bases 


Now we introduce a scalar product on the vector spaces $\Lambda_N^k$. We only consider the case
$N \ge k$, so that the various bases of $\Lambda_N^k$ are indexed by all the integer partitions of $k$.


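9.122. Definition: Hall Scalar Product on $\Lambda_N^k$. For $N \ge k$, define the Hall scalar
product on the vector space $\Lambda_N^k$ by setting (for all $\mu, \nu \in \mathrm{Par}(k)$)

\[ \langle p_\mu, p_\nu \rangle = 0 \text{ if } \mu \ne \nu, \qquad \langle p_\mu, p_\mu \rangle = z_\mu, \]

and extending by bilinearity. In more detail, given $f, g \in \Lambda_N^k$, choose scalars $a_\mu, b_\mu \in \mathbb{R}$ such
that $f = \sum_\mu a_\mu p_\mu$ and $g = \sum_\nu b_\nu p_\nu$. Then $\langle f, g \rangle = \sum_\mu a_\mu b_\mu z_\mu \in \mathbb{R}$.
In the next definition, recall that $\chi(\mu = \nu)$ is 1 if $\mu = \nu$, and 0 otherwise.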
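
To make the definition concrete, here is a small sketch that computes the Hall scalar product of two polynomials given by their power-sum expansions. Representing a polynomial as a dictionary from partitions to coefficients, and the names `z` and `hall`, are assumptions for this example. The test data use the standard degree-2 expansions $h_2 = (p_{(1,1)} + p_{(2)})/2$ and $e_2 = (p_{(1,1)} - p_{(2)})/2$.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def z(mu):
    """z_mu = product over part sizes i of i^(m_i) * m_i!."""
    out = 1
    for part, mult in Counter(mu).items():
        out *= part ** mult * factorial(mult)
    return out

def hall(f, g):
    """Hall scalar product of f and g, each a dict {partition: coefficient of p_partition}."""
    return sum(a * g.get(mu, 0) * z(mu) for mu, a in f.items())

half = Fraction(1, 2)
h2 = {(1, 1): half, (2,): half}     # h_2 in the power-sum basis
e2 = {(1, 1): half, (2,): -half}    # e_2 in the power-sum basis

print(hall(h2, h2), hall(e2, e2), hall(h2, e2))
```

The values printed are consistent with $h_2 = s_{(2)}$ and $e_2 = s_{(1,1)}$ being orthonormal, a fact established for Schur polynomials in general later in this section.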


9.123. Definition: Orthonormal Bases and Dual Bases. Suppose $N \ge k$ and $B_1 =
\{f_\mu : \mu \in \mathrm{Par}(k)\}$ and $B_2 = \{g_\mu : \mu \in \mathrm{Par}(k)\}$ are two bases of $\Lambda_N^k$. $B_1$ is called an
orthonormal basis iff $\langle f_\mu, f_\nu \rangle = \chi(\mu = \nu)$ for all $\mu, \nu \in \mathrm{Par}(k)$. $B_1$ and $B_2$ are called dual
bases iff $\langle f_\mu, g_\nu \rangle = \chi(\mu = \nu)$ for all $\mu, \nu \in \mathrm{Par}(k)$.

For example, it follows from Definition 9.122 that $\{p_\mu / \sqrt{z_\mu} : \mu \in \mathrm{Par}(k)\}$ is an orthonormal
basis of $\Lambda_N^k$. The next theorem allows us to detect dual bases by looking at expansions
of the product $\prod_{i,j} (1 - x_i y_j)^{-1}$.


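9.124. Theorem: Characterization of Dual Bases. Suppose $N \ge k$ and $B_1 = \{f_\mu :
\mu \in \mathrm{Par}(k)\}$ and $B_2 = \{g_\mu : \mu \in \mathrm{Par}(k)\}$ are two bases of $\Lambda_N^k$. $B_1$ and $B_2$ are dual bases iff

\[ \left. \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} \right|_{t^k} = \sum_{\mu \in \mathrm{Par}(k)} f_\mu(x_1, \ldots, x_N)\, g_\mu(y_1, \ldots, y_N), \]

where the left side denotes the coefficient of $t^k$ in the indicated product.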


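Proof. Comparing the displayed equation to (9.17), we must prove that $B_1$ and $B_2$ are dual
bases iff

\[ \sum_{\mu \in \mathrm{Par}(k)} \frac{p_\mu(x_1, \ldots, x_N)\, p_\mu(y_1, \ldots, y_N)}{z_\mu} = \sum_{\mu \in \mathrm{Par}(k)} f_\mu(x_1, \ldots, x_N)\, g_\mu(y_1, \ldots, y_N). \]

The idea of the proof is to convert each condition into a statement about matrices. Since
$\{p_\mu\}$ and $\{p_\mu / z_\mu\}$ are bases of $\Lambda_N^k$, there exist scalars $a_{\mu,\nu}, b_{\mu,\nu} \in \mathbb{R}$ satisfying

\[ f_\nu = \sum_{\mu \in \mathrm{Par}(k)} a_{\mu,\nu}\, p_\mu, \qquad g_\nu = \sum_{\mu \in \mathrm{Par}(k)} b_{\mu,\nu}\, (p_\mu / z_\mu). \]

Define matrices $A = [a_{\mu,\nu}]$ and $B = [b_{\mu,\nu}]$. For all $\lambda, \nu \in \mathrm{Par}(k)$, we use bilinearity to
compute

\[ \begin{aligned}
\langle f_\lambda, g_\nu \rangle
  &= \Bigl\langle \sum_{\mu \in \mathrm{Par}(k)} a_{\mu,\lambda}\, p_\mu,\ \sum_{\rho \in \mathrm{Par}(k)} b_{\rho,\nu}\, (p_\rho / z_\rho) \Bigr\rangle \\
  &= \sum_{\mu, \rho \in \mathrm{Par}(k)} a_{\mu,\lambda}\, b_{\rho,\nu}\, \langle p_\mu, p_\rho / z_\rho \rangle \\
  &= \sum_{\mu \in \mathrm{Par}(k)} a_{\mu,\lambda}\, b_{\mu,\nu} = (A^{\mathrm{tr}} B)_{\lambda,\nu}.
\end{aligned} \]

It follows that $\{f_\mu\}$ and $\{g_\mu\}$ are dual bases iff $A^{\mathrm{tr}} B = I$, where $I$ is the identity matrix of
size $|\mathrm{Par}(k)|$.

On the other hand, writing $\mathbf{x} = (x_1, \ldots, x_N)$ and $\mathbf{y} = (y_1, \ldots, y_N)$, we have

\[ \sum_{\mu \in \mathrm{Par}(k)} f_\mu(\mathbf{x})\, g_\mu(\mathbf{y}) = \sum_{\mu, \alpha, \beta \in \mathrm{Par}(k)} a_{\alpha,\mu}\, b_{\beta,\mu}\, p_\alpha(\mathbf{x})\, p_\beta(\mathbf{y}) / z_\beta. \]

Now, one may check that the indexed set of polynomials

\[ \{ p_\alpha(\mathbf{x})\, p_\beta(\mathbf{y}) / z_\beta : (\alpha, \beta) \in \mathrm{Par}(k) \times \mathrm{Par}(k) \} \]

is linearly independent, using the fact that the power-sum polynomials in one set of variables
are linearly independent. It follows that the expression given above for $\sum_{\mu \in \mathrm{Par}(k)} f_\mu(\mathbf{x}) g_\mu(\mathbf{y})$
is equal to $\sum_{\alpha \in \mathrm{Par}(k)} p_\alpha(\mathbf{x}) p_\alpha(\mathbf{y}) / z_\alpha$ iff $\sum_\mu a_{\alpha,\mu} b_{\beta,\mu} = \chi(\alpha = \beta)$ for all $\alpha, \beta$. In matrix form,
these equations say that $A B^{\mathrm{tr}} = I$. This matrix equation is equivalent to $B^{\mathrm{tr}} A = I$ (since
all the matrices are square), which is equivalent in turn to $A^{\mathrm{tr}} B = I$. We saw above that
this last condition holds iff $B_1$ and $B_2$ are dual bases, so the proof is complete. □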


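9.125. Theorem: Dual Bases of $\Lambda_N^k$. For $N \ge k$, $\{s_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}(k)\}$ is an
orthonormal basis of $\Lambda_N^k$. Also, $\{m_\mu(x_1, \ldots, x_N) : \mu \in \mathrm{Par}(k)\}$ and $\{h_\mu(x_1, \ldots, x_N) : \mu \in
\mathrm{Par}(k)\}$ are dual bases of $\Lambda_N^k$.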


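Proof. In Theorem 9.120, replace every $x_i$ by $t x_i$. Since $s_\lambda$ is homogeneous of degree $|\lambda|$,
we obtain

\[ \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - t x_i y_j} = \sum_{\lambda \in \mathrm{Par}} s_\lambda(y_1, \ldots, y_N)\, s_\lambda(x_1, \ldots, x_N)\, t^{|\lambda|}. \]

Taking the coefficient of $t^k$ on both sides gives

\[ \left. \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - t x_i y_j} \right|_{t^k} = \sum_{\lambda \in \mathrm{Par}(k)} s_\lambda(y_1, \ldots, y_N)\, s_\lambda(x_1, \ldots, x_N). \]

Theorem 9.124 now applies to show that $\{s_\lambda : \lambda \in \mathrm{Par}(k)\}$ is an orthonormal basis. We
proceed similarly to see that $\{m_\mu\}$ and $\{h_\mu\}$ are dual bases, starting with Theorem 9.121. □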


9.126. Theorem: $\omega$ is an Isometry. For $N \ge k$, the map $\omega : \Lambda_N^k \to \Lambda_N^k$ is an isometry
relative to the Hall scalar product. In other words, for all $f, g \in \Lambda_N^k$, $\langle \omega(f), \omega(g) \rangle = \langle f, g \rangle$.
Therefore, $\omega$ maps orthonormal bases to orthonormal bases and dual bases to dual bases.
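

Proof. Given $f, g \in \Lambda_N^k$, write $f = \sum_\mu a_\mu p_\mu$ and $g = \sum_\nu b_\nu p_\nu$ for certain scalars $a_\mu, b_\nu \in \mathbb{R}$.
By linearity of $\omega$ and bilinearity of the Hall scalar product, we compute

\[ \begin{aligned}
\langle \omega(f), \omega(g) \rangle
  &= \Bigl\langle \omega\Bigl( \sum_{\mu \in \mathrm{Par}(k)} a_\mu p_\mu \Bigr),\ \omega\Bigl( \sum_{\nu \in \mathrm{Par}(k)} b_\nu p_\nu \Bigr) \Bigr\rangle \\
  &= \sum_{\mu \in \mathrm{Par}(k)} \sum_{\nu \in \mathrm{Par}(k)} a_\mu b_\nu\, \langle \omega(p_\mu), \omega(p_\nu) \rangle \\
  &= \sum_{\mu \in \mathrm{Par}(k)} \sum_{\nu \in \mathrm{Par}(k)} a_\mu b_\nu\, \epsilon_\mu \epsilon_\nu\, \langle p_\mu, p_\nu \rangle \\
  &= \sum_{\mu \in \mathrm{Par}(k)} a_\mu b_\mu\, \epsilon_\mu^2\, z_\mu.
\end{aligned} \]

The last step follows since we only get a nonzero scalar product when $\nu = \mu$. Now, since
$\epsilon_\mu = \pm 1$, the last expression is

\[ \sum_{\mu \in \mathrm{Par}(k)} a_\mu b_\mu z_\mu = \sum_{\mu \in \mathrm{Par}(k)} \sum_{\nu \in \mathrm{Par}(k)} a_\mu b_\nu\, \langle p_\mu, p_\nu \rangle = \langle f, g \rangle. \qquad \square \]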


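9.127. Theorem: Duality of $e_\mu$ and $\mathrm{fgt}_\mu$. For $N \ge k$, the bases $\{e_\mu : \mu \in \mathrm{Par}(k)\}$ and
$\{\mathrm{fgt}_\mu : \mu \in \mathrm{Par}(k)\}$ (the forgotten basis) are dual. Moreover,

\[ \left. \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - x_i y_j t} \right|_{t^k} = \sum_{\lambda \in \mathrm{Par}(k)} e_\lambda(x_1, \ldots, x_N)\, \mathrm{fgt}_\lambda(y_1, \ldots, y_N). \]


Proof. We know that $\{m_\mu\}$ and $\{h_\mu\}$ are dual bases. Since $\mathrm{fgt}_\mu = \omega(m_\mu)$ and $e_\mu = \omega(h_\mu)$,
$\{\mathrm{fgt}_\mu\}$ and $\{e_\mu\}$ are dual bases. The product formula now follows from Theorem 9.124. □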


9.27 Skew Schur Polynomials 


In this section we study skew Schur polynomials, which are the generating functions for 
generalized tableaux that can have missing squares in the upper-left corner. First we describe 
the diagrams we can use for these new tableaux. 


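9.128. Definition: Skew Shapes. Let $\mu$ and $\nu$ be integer partitions such that $\mathrm{dg}(\nu) \subseteq
\mathrm{dg}(\mu)$, or equivalently $\nu_i \le \mu_i$ for all $i \ge 1$. Define the skew shape

\[ \mu/\nu = \mathrm{dg}(\mu) - \mathrm{dg}(\nu) = \{(i,j) \in \mathbb{Z}_{>0}^2 : 1 \le i \le \ell(\mu),\ \nu_i < j \le \mu_i\}. \]

We visualize $\mu/\nu$ as the collection of unit squares obtained by starting with the diagram
of $\mu$ and erasing the squares in the diagram of $\nu$. If $\nu = (0)$, then $\mu/(0) = \mathrm{dg}(\mu)$. A skew
shape of the form $\mu/(0)$ is sometimes called a straight shape.

9.129. Example. Let $\mu = (7,7,3,3,2,1)$, $\nu = (5,2,2,2,1)$, $\rho = (6,5,4,3,2,2)$, and $\tau =
(3,3,3)$. The skew shapes $\mu/\nu$ and $\rho/\tau$ are shown here (□ marks a cell of the skew shape,
· marks an erased cell):

    μ/ν:                  ρ/τ:
    · · · · · □ □         · · · □ □ □
    · · □ □ □ □ □         · · · □ □
    · · □                 · · · □
    · · □                 □ □ □
    · □                   □ □
    □                     □ □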

Skew shapes need not be connected; for instance, $(5,2,2,1)/(3,2)$ looks like this (□ marks a
cell of the skew shape, · marks an erased cell):

    · · · □ □
    · ·
    □ □
    □

The skew shape $\mu/\nu$ does not always determine $\mu$ and $\nu$ uniquely; for example,
$(5,2,2,1)/(3,2) = (5,3,2,1)/(3,3)$.


9.130. Definition: Skew Tableaux. Given a skew shape $\mu/\nu$, a filling of this shape
is a function $T : \mu/\nu \to \mathbb{Z}$. Such a filling $T$ is a (semistandard) tableau of shape $\mu/\nu$ iff
$T(i,j) \le T(i,j+1)$ and $T(i,j) < T(i+1,j)$ whenever both sides are defined. A tableau $T$
is standard iff $T$ is a bijection from $\mu/\nu$ to $\{1,2,\ldots,n\}$, where $n = |\mu/\nu| = |\mu| - |\nu|$. Let
$\mathrm{SSYT}_N(\mu/\nu)$ be the set of semistandard tableaux of shape $\mu/\nu$ with values in $\{1,2,\ldots,N\}$,
and let $\mathrm{SYT}(\mu/\nu)$ be the set of standard tableaux of shape $\mu/\nu$. For any filling $T$ of $\mu/\nu$,
the content monomial of $T$ is $x^T = \prod_{(i,j) \in \mu/\nu} x_{T(i,j)}$.


9.131. Example. Here is a semistandard tableau of shape (6, 5,5,3)/(3, 2,2) with content 


monomial x7 xqxr3ar4x502: 


9.132. Definition: Skew Schur Polynomials. Given a skew shape $\mu/\nu$ and a positive
integer $N$, define the skew Schur polynomial in $N$ variables by
\[
s_{\mu/\nu}(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu/\nu)} x^T .
\]


9.133. Example. For $\mu = (2,2)$, $\nu = (1)$, and $N = 3$, $\mathrm{SSYT}_N(\mu/\nu)$ is the following set of
tableaux (a dot marks the removed cell):

  . 1    . 1    . 1    . 1    . 1    . 2    . 2    . 2
  1 2    1 3    2 2    2 3    3 3    1 3    2 3    3 3

So, $s_{\mu/\nu}(x_1,x_2,x_3) = x_1^2 x_2 + x_1^2 x_3 + x_1 x_2^2 + 2 x_1 x_2 x_3 + x_1 x_3^2 + x_2^2 x_3 + x_2 x_3^2$. By chance,
$s_{(2,2)/(1)} = s_{(2,1)}$.
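Example 9.133 can be reproduced by brute force. The sketch below is only an illustration under stated assumptions: the function names `skew_cells`, `skew_ssyt`, and `skew_schur` are hypothetical, cells are indexed from 0 for convenience, and every filling is tested against the row and column conditions of Definition 9.130, which is practical only for very small shapes.

```python
from collections import Counter
from itertools import product


def skew_cells(mu, nu):
    """List the cells (row, column) of mu/nu, indexed from 0."""
    nu = list(nu) + [0] * (len(mu) - len(nu))
    return [(i, j) for i, (m, n) in enumerate(zip(mu, nu)) for j in range(n, m)]


def skew_ssyt(mu, nu, N):
    """Yield each semistandard tableau of shape mu/nu with values in 1..N (dict: cell -> value)."""
    cells = skew_cells(mu, nu)
    for values in product(range(1, N + 1), repeat=len(cells)):
        T = dict(zip(cells, values))
        rows_ok = all(T[i, j] <= T[i, j + 1] for (i, j) in cells if (i, j + 1) in T)
        cols_ok = all(T[i, j] < T[i + 1, j] for (i, j) in cells if (i + 1, j) in T)
        if rows_ok and cols_ok:
            yield T


def skew_schur(mu, nu, N):
    """Return s_{mu/nu}(x_1..x_N) as a Counter: exponent vector -> coefficient."""
    poly = Counter()
    for T in skew_ssyt(mu, nu, N):
        content = [0] * N
        for value in T.values():
            content[value - 1] += 1
        poly[tuple(content)] += 1
    return poly


if __name__ == "__main__":
    s = skew_schur((2, 2), (1,), 3)
    for exponents, coeff in sorted(s.items(), reverse=True):
        print(coeff, exponents)
    print(s == skew_schur((2, 1), (), 3))   # the coincidence noted in the example
```

Each key of the returned `Counter` is an exponent vector $\alpha$, so the stored value is the number of tableaux with content $\alpha$.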


9.134. Remark. The polynomials $e_\alpha$ and $h_\alpha$ are special cases of skew Schur polynomials.
For example, consider $h_\alpha = h_{\alpha_1} h_{\alpha_2} \cdots h_{\alpha_s}$. We have seen that each factor $h_{\alpha_i}$ is the
generating function for semistandard tableaux of shape $(\alpha_i)$. There exists a skew shape $\mu/\nu$
consisting of disconnected horizontal rows of lengths $\alpha_1,\ldots,\alpha_s$. When building a semistandard
tableau of this shape, each row can be filled with labels independently of the others.
So the Product Rule for Weighted Sets shows that $h_\alpha(x_1,\ldots,x_N) = s_{\mu/\nu}(x_1,\ldots,x_N)$. For
example, given $h_{(2,4,3,2)} = h_2 h_4 h_3 h_2 = h_{(4,3,2,2)}$, we draw the skew shape:

[Figure: four disconnected rows of lengths 2, 4, 3, 2, read from top to bottom.]

Then $h_{(2,4,3,2)} = s_{(11,9,5,2)/(9,5,2)}$. An analogous procedure works for $e_\alpha$, but now we use
disconnected vertical columns with lengths $\alpha_1,\ldots,\alpha_s$. For example, $e_{(3,3,1)} = s_{\mu/\nu}$ if we
take

$\mu/\nu = (3,3,3,2,2,2,1)/(2,2,2,1,1,1)$.
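The two identities in Remark 9.134 can be sanity-checked by counting tableaux, that is, by setting every variable equal to 1. The sketch below rests on two standard facts that are assumptions here, not statements from this section: $h_a(1,\ldots,1) = \binom{N+a-1}{a}$ and $e_a(1,\ldots,1) = \binom{N}{a}$ when there are $N$ variables. The function names are hypothetical, and agreement of these counts is only a consistency check, not a proof.

```python
from itertools import product
from math import comb, prod


def count_skew_ssyt(mu, nu, N):
    """Count semistandard tableaux of shape mu/nu with values in 1..N by brute force."""
    nu = list(nu) + [0] * (len(mu) - len(nu))
    cells = [(i, j) for i, (m, n) in enumerate(zip(mu, nu)) for j in range(n, m)]
    count = 0
    for values in product(range(1, N + 1), repeat=len(cells)):
        T = dict(zip(cells, values))
        rows_ok = all(T[i, j] <= T[i, j + 1] for (i, j) in cells if (i, j + 1) in T)
        cols_ok = all(T[i, j] < T[i + 1, j] for (i, j) in cells if (i + 1, j) in T)
        count += rows_ok and cols_ok
    return count


def h_at_ones(alpha, N):
    """h_alpha(1,...,1) with N ones: a product of multiset counts."""
    return prod(comb(N + a - 1, a) for a in alpha)


def e_at_ones(alpha, N):
    """e_alpha(1,...,1) with N ones: a product of subset counts."""
    return prod(comb(N, a) for a in alpha)


if __name__ == "__main__":
    print(count_skew_ssyt((11, 9, 5, 2), (9, 5, 2), 2), h_at_ones((2, 4, 3, 2), 2))
    print(count_skew_ssyt((3, 3, 3, 2, 2, 2, 1), (2, 2, 2, 1, 1, 1), 4), e_at_ones((3, 3, 1), 4))
```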


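For any skew shape $\mu/\nu$ and $\alpha \in \mathbb{Z}_{\ge 0}^N$, define the skew Kostka number $K_{\mu/\nu,\alpha}$ to be the
coefficient of $x^\alpha$ in $s_{\mu/\nu}(x_1,\ldots,x_N)$, which is the number of semistandard tableaux with
shape $\mu/\nu$ and content $\alpha$. The next result is proved exactly like the theorems in §9.5.


9.135. Theorem: Symmetry and Monomial Expansion of Skew Schur Polynomials.
For all skew shapes $\mu/\nu$ and all $\alpha, \beta \in \mathbb{Z}_{\ge 0}^N$ with $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$, $K_{\mu/\nu,\alpha} = K_{\mu/\nu,\beta}$.
Hence, letting $k = |\mu/\nu|$, we have $s_{\mu/\nu} \in \Lambda_N^k$ and
\[
s_{\mu/\nu}(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}_N(k)} K_{\mu/\nu,\lambda}\, m_\lambda(x_1,\ldots,x_N).
\]

Similarly, the proofs of Theorems 9.64 and 9.69 extend to prove the following result.

9.136. Theorem: Skew Pieri Rules. For all $\mu \in \mathrm{Par}_N$ and all $\alpha \in \mathbb{Z}_{\ge 0}^s$,
\[
s_\mu h_\alpha = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda/\mu,\alpha}\, s_\lambda ; \qquad
s_\mu e_\alpha = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda'/\mu',\alpha}\, s_\lambda .
\]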


In the next chapter, we prove that skew Schur polynomials are related to the Hall scalar
product as follows: for any skew shape $\mu/\nu$ with $k$ cells and any $f \in \Lambda_N^k$ with $N \ge k$,
$\langle s_{\mu/\nu}, f \rangle = \langle s_\mu, s_\nu f \rangle$. This identity can be used to prove that $\omega(s_{\mu/\nu}) = s_{\mu'/\nu'}$.


9.28 Abstract Symmetric Functions 


So far, we have discussed symmetric polynomials, which involve only finitely many variables
$x_1,\ldots,x_N$. For most purposes, $N$ (the number of variables) is not important as long as
we restrict attention to symmetric polynomials of degree $k \le N$. This section formally
defines symmetric functions, which can be regarded intuitively as symmetric polynomials
in infinitely many variables.

Let $K$ be a field containing $\mathbb{Q}$. Recall that $\Lambda_N$ is the algebra of symmetric polynomials
in $N$ variables $x_1,\ldots,x_N$ with coefficients in $K$. The key initial idea is to view $\Lambda_N$ as the
polynomial ring $K[p_1,\ldots,p_N]$, thinking of the power-sums $p_j$ as algebraically independent
formal variables, not as polynomials in the $x$-variables (see §9.15). We can define a new
$K$-algebra $\Lambda$ by taking the union of these rings:

\[
\Lambda = \bigcup_{N=1}^{\infty} K[p_1,\ldots,p_N] = K[p_n : n \in \mathbb{Z}_{>0}].
\]

Given $f, g \in \Lambda$, $f$ and $g$ both lie in $K[p_1,\ldots,p_N]$ for some $N$, so $f + g$ and $f \cdot g$ are defined.
We call $\Lambda$ the algebra of symmetric functions with coefficients in $K$. By definition, every
$f \in \Lambda$ is a finite linear combination of monomials, where each monomial is a product of
a finite sequence of $p_j$'s (repeats allowed). As before, given a sequence of positive integers
$\alpha = (\alpha_1,\ldots,\alpha_s)$, we define $p_\alpha = p_{\alpha_1} \cdots p_{\alpha_s}$. Since multiplication in $\Lambda$ is commutative, it
follows that every $f \in \Lambda$ can be written uniquely as a sum $f = \sum_{\mu \in \mathrm{Par}} c_\mu p_\mu$ where $c_\mu \in K$
and only finitely many $c_\mu$ are nonzero.

In an ordinary polynomial ring $K[x_1,\ldots,x_N]$, each formal variable $x_i$ has degree 1.
However, in the polynomial rings $\Lambda_N$ and $\Lambda$, we define the degree of the formal power-sum
$p_j$ to be $j$. The degree of a monomial $c\,p_\alpha$ is then $|\alpha| = \alpha_1 + \cdots + \alpha_s$. Let $\Lambda^k$ be the set of
$f \in \Lambda$ that are homogeneous of degree $k$. It is immediate that for all $k \ge 0$, $\{p_\mu : \mu \in \mathrm{Par}(k)\}$
is a basis for $\Lambda^k$, and that $\Lambda$ is the direct sum of its subspaces $\Lambda^k$. Moreover, for $f \in \Lambda^k$
and $g \in \Lambda^m$, $fg \in \Lambda^{k+m}$. So $\Lambda$ is a graded commutative $K$-algebra.

We can use power-sum expansions developed earlier to define symmetric function versions
of $h_\alpha$ and $e_\alpha$. Specifically, for all $n > 0$, let

\[
h_n = \sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1} p_\mu , \qquad e_n = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu z_\mu^{-1} p_\mu .
\]

Then set $h_0 = e_0 = 1$, $h_\alpha = \prod_{i \ge 1} h_{\alpha_i}$, and $e_\alpha = \prod_{i \ge 1} e_{\alpha_i}$. Schur symmetric functions can
be defined either by the power-sum expansion given later in Theorem 10.53, or by inverting
the linear system

\[
h_\mu = \sum_{\lambda \in \mathrm{Par}} K_{\lambda,\mu}\, s_\lambda .
\]

As before, we define the Hall scalar product on $\Lambda$ by letting $\langle p_\lambda, p_\mu \rangle = \chi(\lambda = \mu)\, z_\mu$
for all partitions $\lambda$ and $\mu$. The symmetric functions $m_\mu$ can be defined as the unique basis
dual to $\{h_\mu : \mu \in \mathrm{Par}\}$ with respect to this scalar product. Similarly, the forgotten basis
is dual to the elementary basis. The Schur basis is orthonormal relative to the Hall scalar
product. We define the algebra isomorphism $\omega$ on $\Lambda$ by specifying its effect on the generators:
$\omega(p_n) = (-1)^{n-1} p_n$ for all $n \ge 1$. It follows that $\omega(e_n) = h_n$, $\omega(h_n) = e_n$, $\omega(p_\mu) = \epsilon_\mu p_\mu$,
and $\omega(s_\mu) = s_{\mu'}$ for all partitions $\mu$.
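The power-sum definitions of $h_n$ and $e_n$ above can be checked numerically by substituting concrete numbers for the variables. The sketch below is an illustration only. It assumes the usual formulas from earlier in the chapter, $z_\mu = \prod_i i^{m_i} m_i!$ (with $m_i$ the number of parts of $\mu$ equal to $i$) and $\epsilon_\mu = (-1)^{|\mu| - \ell(\mu)}$, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import factorial, prod


def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def z(mu):
    """z_mu = product over distinct parts i of i^(m_i) * m_i! (assumed standard definition)."""
    return prod(i ** mu.count(i) * factorial(mu.count(i)) for i in set(mu))


def eps(mu):
    """epsilon_mu = (-1)^(|mu| - l(mu))."""
    return (-1) ** (sum(mu) - len(mu))


def p(mu, xs):
    """Value of the power-sum polynomial p_mu at the numbers xs."""
    return prod(sum(x ** part for x in xs) for part in mu)


def h_via_p(n, xs):
    return sum(Fraction(p(mu, xs), z(mu)) for mu in partitions(n))


def e_via_p(n, xs):
    return sum(Fraction(eps(mu) * p(mu, xs), z(mu)) for mu in partitions(n))


def h_direct(n, xs):
    """Sum of all products x_{i_1}...x_{i_n} with i_1 <= ... <= i_n."""
    return sum(prod(c) for c in combinations_with_replacement(xs, n))


def e_direct(n, xs):
    """Sum of all products x_{i_1}...x_{i_n} with i_1 < ... < i_n."""
    return sum(prod(c) for c in combinations(xs, n))


if __name__ == "__main__":
    xs = (2, 3, 5, 7)
    for n in range(6):
        print(n, h_via_p(n, xs) == h_direct(n, xs), e_via_p(n, xs) == e_direct(n, xs))
```

Exact rational arithmetic (`Fraction`) is used so that the division by $z_\mu$ introduces no rounding error.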

We can define evaluation homomorphisms with domain $\Lambda$ as follows. Given any $K$-algebra
$B$ and arbitrary elements $b_j \in B$, there exists a unique algebra homomorphism
$T : \Lambda \to B$ such that $T(p_j) = b_j$ for all $j \ge 1$. The map $T$ sends the symmetric function
$f = \sum_{\mu \in \mathrm{Par}} c_\mu p_\mu$ to $T(f) = \sum_{\mu \in \mathrm{Par}} c_\mu \prod_i b_{\mu_i}$. These homomorphisms enable us to connect
abstract symmetric functions to concrete symmetric polynomials. Given $N > 0$, consider
the evaluation homomorphism $E_N : \Lambda \to K[x_1,\ldots,x_N]$ sending the abstract power-sum $p_j$
to the power-sum polynomial $p_j(x_1,\ldots,x_N) = x_1^j + x_2^j + \cdots + x_N^j$ for all $j \ge 1$. The image
of $E_N$ is $\Lambda_N$, viewed as a subring of $K[x_1,\ldots,x_N]$. By Theorem 9.79, $E_N$ restricts to a
vector space isomorphism from $\Lambda^k$ onto $\Lambda_N^k$, as long as $N \ge k$.

We can use these homomorphisms to transfer information about $\Lambda_N$ to information
about $\Lambda$. For example, let us show that the infinite list $e_1, e_2, \ldots, e_n, \ldots$ is algebraically
independent in $\Lambda$. This means that whenever a linear combination $\sum_{\mu \in F} c_\mu e_\mu$ is zero (where
each $c_\mu \in K$ and $F$ is a finite set of partitions), then every $c_\mu$ must be zero. Given such a
linear combination, let $N$ be the maximum of all parts $\mu_i$ of all partitions $\mu \in F$. Applying $E_N$
to the given linear combination, we find that $\sum_{\mu \in F} c_\mu e_\mu(x_1,\ldots,x_N) = 0$ in $K[x_1,\ldots,x_N]$.
Since we know that the symmetric polynomials $e_1(x_1,\ldots,x_N),\ldots,e_N(x_1,\ldots,x_N)$ are
algebraically independent, we can conclude that every $c_\mu$ is zero. A similar argument shows
that $h_1, h_2, \ldots, h_n, \ldots$ is an algebraically independent list in $\Lambda$. So, if we prefer, we could
have defined $\Lambda$ to be the polynomial ring $K[h_n : n \in \mathbb{Z}_{>0}]$ or $K[e_n : n \in \mathbb{Z}_{>0}]$, viewing all
$h_n$ (or $e_n$) as formal variables.

It may be tempting to think of the abstract symmetric function $p_j$ as the formal infinite
sum $\sum_{n=1}^{\infty} x_n^j$. To justify this rigorously, we need the evaluation homomorphism from $\Lambda$ into
the formal power series ring $K[[x_n : n \in \mathbb{Z}_{>0}]]$ such that $p_j$ maps to $p_j(\mathbf{x}) = \sum_{n=1}^{\infty} x_n^j$. It can
be shown that this homomorphism sends $e_k$ to $e_k(\mathbf{x}) = \sum_{1 \le i_1 < i_2 < \cdots < i_k} x_{i_1} x_{i_2} \cdots x_{i_k}$ and
$s_\mu$ to $s_\mu(\mathbf{x}) = \sum_{T \in \mathrm{SSYT}(\mu)} x^T$. However, it can be dangerous to think of $\Lambda$ in this way. For
example, we could try to extend Theorem 9.85 to the following identity involving infinitely
many variables $x_i$:

\[
\prod_{i=1}^{\infty} (1 + x_i) = \sum_{n=0}^{\infty} e_n(\mathbf{x}).
\]

But the right side of this formula is not in the image of $\Lambda$ under the evaluation homomorphism
sending $f \in \Lambda$ to $f(\mathbf{x})$. The reason is that $\sum_{n=0}^{\infty} e_n$ is not an element of $\Lambda$, since only
finite linear combinations of the $p_\mu$ are allowed.

One way around this difficulty is to introduce a new formal variable $t$ and work in the
formal power series ring $K[[t, x_1, x_2, \ldots]]$. In this ring,

\[
\prod_{i=1}^{\infty} (1 + t x_i) = \sum_{n=0}^{\infty} e_n(\mathbf{x})\, t^n \quad\text{and}\quad \prod_{i=1}^{\infty} \frac{1}{1 - t x_i} = \sum_{n=0}^{\infty} h_n(\mathbf{x})\, t^n ,
\]

although it must still be proved that the infinite products converge. For the Cauchy identities,
we work in the ring $K[[t, x_1, x_2, \ldots, y_1, y_2, \ldots]]$, where (for example)

\[
\prod_{i=1}^{\infty} \prod_{j=1}^{\infty} \frac{1}{1 - t x_i y_j} = \sum_{k=0}^{\infty} t^k \sum_{\mu \in \mathrm{Par}(k)} h_\mu(\mathbf{x})\, m_\mu(\mathbf{y}) = \sum_{k=0}^{\infty} t^k \sum_{\mu \in \mathrm{Par}(k)} s_\mu(\mathbf{x})\, s_\mu(\mathbf{y}).
\]
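For finitely many variables, the role of the extra variable $t$ can be illustrated with truncated power series: the coefficient of $t^n$ in $\prod_i (1 + t x_i)$ should equal $e_n$, and the coefficient of $t^n$ in $\prod_i (1 - t x_i)^{-1}$ should equal $h_n$. The sketch below is only a finite-variable illustration with numbers substituted for the $x_i$; it does not address convergence of infinite products, and the function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def poly_mul(a, b, max_deg):
    """Multiply coefficient lists (index = power of t), dropping powers above max_deg."""
    out = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= max_deg:
                out[i + j] += ai * bj
    return out


def E_series(xs, max_deg):
    """Coefficients of prod_i (1 + x_i t) up to t^max_deg."""
    series = [1] + [0] * max_deg
    for x in xs:
        series = poly_mul(series, [1, x], max_deg)
    return series


def H_series(xs, max_deg):
    """Coefficients of prod_i 1/(1 - x_i t) up to t^max_deg, via geometric series."""
    series = [1] + [0] * max_deg
    for x in xs:
        geometric = [x ** k for k in range(max_deg + 1)]  # 1/(1 - x t) = sum_k x^k t^k
        series = poly_mul(series, geometric, max_deg)
    return series


def e_direct(n, xs):
    return sum(prod(c) for c in combinations(xs, n))


def h_direct(n, xs):
    return sum(prod(c) for c in combinations_with_replacement(xs, n))


if __name__ == "__main__":
    xs, D = (2, 3, 5), 6
    E, H = E_series(xs, D), H_series(xs, D)
    print(E == [e_direct(n, xs) for n in range(D + 1)])
    print(H == [h_direct(n, xs) for n in range(D + 1)])
    E_neg = [(-1) ** n * c for n, c in enumerate(E)]      # coefficients of E(-t)
    print(poly_mul(H, E_neg, D))                          # expected: 1 followed by zeros
```

Truncating at degree `max_deg` is harmless because the coefficient of $t^n$ in a product depends only on coefficients of degree at most $n$ in the factors.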


Summary 


Table 9.1 summarizes information about five bases for the vector space $\Lambda_N^k$ of symmetric
polynomials in $N$ variables that are homogeneous of degree $k$. The statements about dual
bases assume $N \ge k$. Recall that $\mathrm{Par}_N(k)$ is the set of integer partitions of $k$ into at
most $N$ parts, while $\mathrm{Par}_N(k)'$ is the set of partitions of $k$ where every part is at most $N$.
Table 9.2 gives formulas and recursions for expressing certain symmetric polynomials as
linear combinations of other symmetric polynomials. More identities of this type appear in
the summary of Chapter 10.


• Tableaux and Schur Polynomials. Given a partition $\mu$, a semistandard tableau of shape $\mu$
is a filling of the cells in $\mathrm{dg}(\mu)$ so that rows weakly increase and columns strictly increase.
The Schur polynomial in $N$ variables indexed by $\mu$ is

\[
s_\mu(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^T ,
\]

where the power of $x_i$ is the number of $i$'s in $T$. Schur polynomials are symmetric, since an
involution exists that switches the frequencies of $i$'s and $(i+1)$'s in semistandard tableaux
of shape $\mu$. Similar remarks hold for skew Schur polynomials $s_{\mu/\nu}$, which enumerate
semistandard tableaux using the shape $\mu/\nu$ obtained by removing the diagram of $\nu$ from
the diagram of $\mu$.


• Orderings on Partitions. For $\mu, \nu \in \mathrm{Par}(k)$, $\mu \le_{\mathrm{lex}} \nu$ means that $\mu = \nu$ or the first nonzero
entry of $\nu - \mu$ is positive; $\le_{\mathrm{lex}}$ is a total ordering on $\mathrm{Par}(k)$. We say $\mu \trianglelefteq \nu$ ($\mu$ is dominated
by $\nu$) iff $\mu_1 + \cdots + \mu_i \le \nu_1 + \cdots + \nu_i$ for all $i \ge 1$; $\trianglelefteq$ is a partial ordering on $\mathrm{Par}(k)$. We
have $\mu \trianglelefteq \nu$ iff $\nu' \trianglelefteq \mu'$ iff $\mu$ can be transformed into $\nu$ by a sequence of raising operators
(moving one box to a higher row). Also, $\mu \trianglelefteq \nu$ implies $\mu \le_{\mathrm{lex}} \nu$.
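TABLE 9.1
Bases for $\Lambda_N^k$.

Basis of $\Lambda_N^k$ | Definition | Dual | Action of $\omega$
Monomial $\{m_\mu : \mu \in \mathrm{Par}_N(k)\}$ | $m_\mu = \sum_{\alpha \in \mathbb{Z}_{\ge 0}^N:\ \mathrm{sort}(\alpha) = \mu} x^\alpha$ | $h_\mu$ | $\omega(m_\mu) = \mathrm{fgt}_\mu$
Elementary $\{e_\mu : \mu \in \mathrm{Par}_N(k)'\}$ | $e_n = \sum_{1 \le i_1 < i_2 < \cdots < i_n \le N} x_{i_1} x_{i_2} \cdots x_{i_n}$, $e_\mu = \prod_i e_{\mu_i}$ | $\mathrm{fgt}_\mu$ | $\omega(e_\mu) = h_\mu$
Complete $\{h_\mu : \mu \in \mathrm{Par}_N(k)\}$ or $\{h_\mu : \mu \in \mathrm{Par}_N(k)'\}$ | $h_n = \sum_{1 \le i_1 \le i_2 \le \cdots \le i_n \le N} x_{i_1} x_{i_2} \cdots x_{i_n}$, $h_\mu = \prod_i h_{\mu_i}$ | $m_\mu$ | $\omega(h_\mu) = e_\mu$
Power-sum $\{p_\mu : \mu \in \mathrm{Par}_N(k)'\}$ | $p_n = \sum_{i=1}^{N} x_i^n$, $p_\mu = \prod_i p_{\mu_i}$ | $p_\mu / z_\mu$ | $\omega(p_\mu) = \epsilon_\mu p_\mu$
Schur $\{s_\mu : \mu \in \mathrm{Par}_N(k)\}$ | $s_\mu = \sum_{T \in \mathrm{SSYT}_N(\mu)} x^T$ | $s_\mu$ | $\omega(s_\mu) = s_{\mu'}$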


• Kostka Numbers. For $\mu, \nu \in \mathrm{Par}$ and $\alpha \in \mathbb{Z}_{\ge 0}^N$, the Kostka number $K_{\mu/\nu,\alpha}$ is the number
of semistandard tableaux of shape $\mu/\nu$ and content $\alpha$. We have $K_{\mu,\mu} = 1$ for all $\mu \in \mathrm{Par}$,
and $K_{\lambda,\mu} \ne 0$ implies $\mu \trianglelefteq \lambda$ and $\mu \le_{\mathrm{lex}} \lambda$.


• Tableau Insertion. Given a semistandard tableau $T$ and value $x$, we obtain a new
semistandard tableau $T \leftarrow x$ as follows. The element $x$ bumps the leftmost value $y > x$ in
the top row into the second row, and this bumping continues recursively until a value is
placed in a new box at the end of some row. The bumping path moves weakly left as it
goes down. Insertion is invertible if we know which corner box is the new one. If we insert
a weakly increasing sequence into $T$, the new boxes move strictly right and weakly higher,
producing a horizontal strip. If we insert a strictly decreasing sequence into $T$, the new
boxes move weakly left and strictly lower, producing a vertical strip.


• The Pieri Rules. (a) $s_\mu h_k = \sum_\nu s_\nu$, where we sum over all $\nu$ such that $\nu/\mu$ is a horizontal
strip of size $k$. (b) $s_\mu e_k = \sum_\nu s_\nu$, where we sum over all $\nu$ such that $\nu/\mu$ is a vertical strip
of size $k$. If there are $N$ variables, only shapes $\nu$ with at most $N$ parts contribute nonzero
terms to the sum.


• Algebraic Independence. A list of polynomials $f_1,\ldots,f_k$ is algebraically independent iff
the set of monomials $\{f_1^{i_1} \cdots f_k^{i_k} : i_j \in \mathbb{Z}_{\ge 0}\}$ is linearly independent. Equivalently, the
evaluation homomorphism with domain $\mathbb{R}[y_1,\ldots,y_k]$ sending $y_i$ to $f_i$ is one-to-one. The
list $f_1,\ldots,f_k \in \mathbb{R}[x_1,\ldots,x_k]$ is algebraically independent if and only if
$\det\left[\partial f_i / \partial x_j\right]_{1 \le i,j \le k} \ne 0$.


• Algebraically Independent Symmetric Polynomials. In the ring $\mathbb{R}[x_1,\ldots,x_N]$, the lists
$p_1,\ldots,p_N$ and $h_1,\ldots,h_N$ and $e_1,\ldots,e_N$ are algebraically independent. So there are three
isomorphisms from the polynomial ring $\mathbb{R}[z_1,\ldots,z_N]$ onto the algebra $\Lambda_N$ (we can send
each $z_j$ to $p_j$, to $h_j$, or to $e_j$).
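TABLE 9.2
Expansions and recursions for symmetric polynomials.

Monomial expansion of Schur basis | $s_\lambda = \sum_{\mu \in \mathrm{Par}(|\lambda|)} K_{\lambda,\mu}\, m_\mu$
Schur expansion of complete basis | $h_\alpha = \sum_{\lambda \in \mathrm{Par}(|\alpha|)} K_{\lambda,\alpha}\, s_\lambda$
Schur expansion of elementary basis | $e_\alpha = \sum_{\lambda \in \mathrm{Par}(|\alpha|)} K_{\lambda',\alpha}\, s_\lambda$
Power-sum expansion of $h_n$ | $h_n = \sum_{\mu \in \mathrm{Par}(n)} z_\mu^{-1} p_\mu$
Power-sum expansion of $e_n$ | $e_n = \sum_{\mu \in \mathrm{Par}(n)} \epsilon_\mu z_\mu^{-1} p_\mu$
Schur expansion of $p_{(1^n)}$ | $p_{(1^n)} = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|\, s_\lambda$
Monomial expansion of skew Schur | $s_{\mu/\nu} = \sum_{\rho \in \mathrm{Par}(|\mu/\nu|)} K_{\mu/\nu,\rho}\, m_\rho$
Schur expansion of $s_\mu h_\alpha$ | $s_\mu h_\alpha = \sum_{\lambda} K_{\lambda/\mu,\alpha}\, s_\lambda$
Schur expansion of $s_\mu e_\alpha$ | $s_\mu e_\alpha = \sum_{\lambda} K_{\lambda'/\mu',\alpha}\, s_\lambda$
Recursion linking $e$'s and $h$'s | $\sum_{i=0}^{n} (-1)^i e_i h_{n-i} = \chi(n = 0)$
Recursion linking $h$'s and $p$'s | $\sum_{i=0}^{n-1} h_i p_{n-i} = n h_n$
Recursion linking $e$'s and $p$'s | $\sum_{i=0}^{n-1} (-1)^i e_i p_{n-i} = (-1)^{n-1} n e_n$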


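• Generating Functions for $e$'s and $h$'s. We have

\[
E_N(t) = \prod_{i=1}^{N} (1 + x_i t) = \sum_{k=0}^{N} e_k(x_1,\ldots,x_N)\, t^k ,
\qquad
H_N(t) = \prod_{i=1}^{N} (1 - x_i t)^{-1} = \sum_{k=0}^{\infty} h_k(x_1,\ldots,x_N)\, t^k ,
\]

so $H_N(t) E_N(-t) = 1$.


• Dual Bases and Cauchy Identities. Assume $N \ge k$. The Hall scalar product on $\Lambda_N^k$ is
defined by setting $\langle p_\mu, p_\nu \rangle = z_\mu \chi(\mu = \nu)$ and extending by bilinearity. Two bases
$\{f_\mu : \mu \in \mathrm{Par}(k)\}$ and $\{g_\mu : \mu \in \mathrm{Par}(k)\}$ of $\Lambda_N^k$ are dual relative to this inner product iff they
satisfy the Cauchy identity

\[
\text{coefficient of } t^k \text{ in } \prod_{i=1}^{N} \prod_{j=1}^{N} \frac{1}{1 - t x_i y_j} = \sum_{\mu \in \mathrm{Par}(k)} f_\mu(x_1,\ldots,x_N)\, g_\mu(y_1,\ldots,y_N).
\]

In particular, this identity holds with $f_\mu = g_\mu = s_\mu$; $f_\mu = m_\mu$ and $g_\mu = h_\mu$; and $f_\mu = p_\mu$
and $g_\mu = p_\mu / z_\mu$.


• The Map $\omega$. There is a unique algebra isomorphism $\omega : \Lambda_N \to \Lambda_N$ such that
$\omega(p_i) = (-1)^{i-1} p_i$ for all $i$ between 1 and $N$. We have $\omega(p_\mu) = \epsilon_\mu p_\mu$, where $\epsilon_\mu = (-1)^{|\mu| - \ell(\mu)}$;
$\omega(e_\mu) = h_\mu$, $\omega(h_\mu) = e_\mu$, and $\omega(s_\mu) = s_{\mu'}$. The map $\omega$ is an involution ($\omega \circ \omega = \mathrm{id}$). For
$k \le N$, $\omega$ is an isometry of $\Lambda_N^k$, which means $\langle \omega(f), \omega(g) \rangle = \langle f, g \rangle$ for all $f, g \in \Lambda_N^k$.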


• RSK Correspondences. There are bijections between: (a) permutations in $S_n$ and pairs
$(P,Q)$ of standard tableaux of the same shape $\lambda \in \mathrm{Par}(n)$; (b) words in $[N]^n$ and pairs
$(P,Q)$ where $P \in \mathrm{SSYT}_N(\lambda)$ and $Q \in \mathrm{SYT}(\lambda)$ for some $\lambda \in \mathrm{Par}(n)$; (c) $M \times N$ matrices
with values in $\mathbb{Z}_{\ge 0}$ and pairs $(P,Q)$ where $P \in \mathrm{SSYT}_N(\lambda)$ and $Q \in \mathrm{SSYT}_M(\lambda)$. In each
case, one inserts successive entries into $P$, using $Q$ to record the locations of new boxes.
For (c), one must first encode the matrix as a biword. If $w \in S_n$ maps to $(P,Q)$, then $w^{-1}$
maps to $(Q,P)$. If $w \in [N]^n$ maps to $(P,Q)$, then $\mathrm{Des}(w) = \mathrm{Des}(Q)$, $\mathrm{des}(w) = \mathrm{des}(Q)$,
and $\mathrm{maj}(w) = \mathrm{maj}(Q)$, where $\mathrm{Des}(Q)$ is the set of $k < n$ such that $k+1$ is in a lower row
of $Q$ than $k$, $\mathrm{des}(Q) = |\mathrm{Des}(Q)|$, and $\mathrm{maj}(Q) = \sum_{k \in \mathrm{Des}(Q)} k$.


• Symmetric Functions. Viewing each power-sum $p_n$ as a formal variable, we can define the
algebra of symmetric functions $\Lambda = K[p_n : n \in \mathbb{Z}_{>0}]$. Symmetric functions $h_\mu$, $e_\mu$, and
$s_\mu$ can be defined via their expansions in the power-sum basis. We can define evaluation
homomorphisms with domain $\Lambda$ by sending each $p_n$ to any element of a given $K$-algebra.
Many identities for symmetric polynomials in finitely many variables extend to the setting
of symmetric functions. In particular, $\{e_n : n \in \mathbb{Z}_{>0}\}$ and $\{h_n : n \in \mathbb{Z}_{>0}\}$ are algebraically
independent subsets of $\Lambda$.
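The insertion and RSK items in this summary can be sketched in a few lines of Python. This is a minimal illustration, not the book's presentation: tableaux are assumed to be stored as lists of rows, the names `row_insert` and `rsk` are hypothetical, and `bisect_right` is used to locate the leftmost entry strictly greater than the inserted value.

```python
from bisect import bisect_right


def row_insert(P, x):
    """Insert x into the tableau P (a list of rows, modified in place).

    Returns the index of the row that receives the new box.
    """
    for r, row in enumerate(P):
        k = bisect_right(row, x)       # position of the leftmost entry y > x
        if k == len(row):              # no such entry: x goes at the end of this row
            row.append(x)
            return r
        row[k], x = x, row[k]          # x replaces y, and y is bumped to the next row
    P.append([x])                      # the bumped value starts a new row
    return len(P) - 1


def rsk(word):
    """Return (P, Q): the insertion tableau and the recording tableau of a word."""
    P, Q = [], []
    for step, x in enumerate(word, start=1):
        r = row_insert(P, x)
        if r == len(Q):
            Q.append([])
        Q[r].append(step)              # record when the new box was created
    return P, Q


if __name__ == "__main__":
    w = [3, 5, 1, 6, 4, 8, 7, 2]       # the permutation 35164872 in one-line form
    P, Q = rsk(w)
    print(P, Q)
    w_inv = [w.index(i) + 1 for i in range(1, len(w) + 1)]
    print(rsk(w_inv) == (Q, P))        # inverse permutation swaps the two tableaux
```

Because each new box is created at the end of a row, appending the step number to the matching row of `Q` keeps `Q` the same shape as `P`.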


Exercises 


9-1. Consider fillings of shape $\mu \in \mathrm{Par}(k)$ using values in $[N]$. (a) How many fillings are
there? (b) How many fillings have weakly increasing rows? (c) How many fillings have
strictly increasing columns?

9-2. List all the tableaux in: (a) SSYT5((8, 2)); (b) SSYT2((3, 2)); (c) SYT((3, 2, 1)). 

9-3. Give a direct counting argument to find | SYT()| when yz = (a, 1°) is a hook shape. 
9-4. Compute $s_{(2,2)}(x_1,\ldots,x_N)$ for $N = 3, 4, 5$ by enumerating tableaux.

9-5. Find the coefficients of the following monomials in s(3,21)(#1,.--,%6) by enumerating 
tableaux: (a) 212203240526; (b) x?x3x2; (c) xpa3; (d) r?xor3x425; (e) 21 L2xkr4%5; 

(f) xirorgx4r2. 

9-6. List all terms in: (a) p4a(a1,2%2,23); (b) e3(a1,¥2,03,04,2%5); (c) h3(a1, 72,23); 
(d) 1(3,2,2) (2, U2, U3, ta). 

9-7. For w = (2,1), compute: (a) p,(v1, 22, ©3); (b) en (#1, £2, 03); (Cc) hy (x1, £2, v3). 

9-8. Find how many monomials appear with nonzero coefficient in: (a) $e_k(x_1,\ldots,x_N)$;
(b) $h_k(x_1,\ldots,x_N)$; (c) $m_\mu(x_1,\ldots,x_N)$.

9-9. Give a direct proof that the polynomials $e_k(x_1,\ldots,x_N)$ and $h_k(x_1,\ldots,x_N)$ (as defined
in Definitions 9.15 and 9.16) are symmetric.

9-10. Check that the set of homogeneous polynomials of degree $k$ is a vector subspace of
$\mathbb{R}[x_1,\ldots,x_N]$ for all $k \ge 0$. Conclude that $\Lambda_N^k$ is a subspace of $\Lambda_N$ for each $k \ge 0$.

9-11. List five bases for the real vector space A. 

9-12. Compute the dimension of $\Lambda_N^k$ for all $k, N$ in the range $1 \le k \le 6$ and $1 \le N \le 6$.

9-13. Suppose $\{f_i : i \in I\}$ is a collection of nonzero polynomials in $\mathbb{R}[x_1,\ldots,x_N]$ such that,
whenever some $x^\alpha$ appears in some $f_i$ with nonzero coefficient, the coefficient of $x^\alpha$ in every
other $f_j$ is zero. Prove that $\{f_i : i \in I\}$ is linearly independent.

9-14. Compute the following Kostka numbers: (a) $K_{(3,3,2),(2,1,2,1,1,1)}$; (b) $K_{(3,2,2,1),(2,2,1,1,1,1)}$;
(c) $K_{(5,5),(1^{10})}$; (d) $K_{(3,3,3)/(2,1),(2,2,1,1)}$.

9-15. Compute the image of the first tableau in the proof of Theorem 9.27 under the maps 
fi, for i = 1,2,4,5,6,7,8. 

9-16. Express the Schur polynomials $s_\mu(x_1,x_2,x_3,x_4,x_5)$ as explicit linear combinations of
monomial symmetric polynomials, for all $\mu$ in $\mathrm{Par}(4)$ and $\mathrm{Par}(5)$.

9-17. (a) Find a recursion characterizing the Kostka numbers $K_{\mu/\nu,\alpha}$. (b) Use (a) to write
a computer program for computing Kostka numbers.
9-18. Check that $\le_{\mathrm{lex}}$ is a total ordering of the set $\mathrm{Par}(k)$, for each $k \ge 0$.

9-19. Prove that $\trianglelefteq$ is a total ordering of $\mathrm{Par}(k)$ iff $k \le 5$.

9-20. (a) List the integer partitions of 7 in lexicographic order. (b) Find all pairs $\mu, \nu \in \mathrm{Par}(7)$
such that $\mu \le_{\mathrm{lex}} \nu$ but $\mu \not\trianglelefteq \nu$.

9-21. (a) Find an ordered sequence of raising operators that changes $\mu = (5,4,2,1,1)$ to
$\nu = (7,3,2,1)$. (b) How many such sequences are there?

9-22. Prove or disprove: for all partitions $\mu, \nu \in \mathrm{Par}(k)$, $\mu \le_{\mathrm{lex}} \nu$ iff $\nu' \le_{\mathrm{lex}} \mu'$.

9-23. Let $\mu, \nu \in \mathrm{Par}(k)$. Can you prove that $\mu \trianglelefteq \nu$ implies $\nu' \trianglelefteq \mu'$ directly from the definitions,
without using raising operators?

9-24. Define an ordering $\le_{\mathrm{lex}}$ on the set $\mathbb{Z}_{\ge 0}^N$ as in Definition 9.30. Show that $\le_{\mathrm{lex}}$ is a total
ordering of $\mathbb{Z}_{\ge 0}^N$ satisfying the following well-ordering property: every nonempty subset of
$\mathbb{Z}_{\ge 0}^N$ has a least element relative to $\le_{\mathrm{lex}}$.

9-25. Define the lex degree of a nonzero polynomial $f(x_1,\ldots,x_N) \in \mathbb{R}[x_1,\ldots,x_N]$, denoted
$\mathrm{degl}(f)$, to be the largest $\alpha \in \mathbb{Z}_{\ge 0}^N$ (relative to the lexicographic ordering defined in the
previous exercise) such that $x^\alpha$ occurs with nonzero coefficient in $f$. Prove that $\mathrm{degl}(gh) =
\mathrm{degl}(g) + \mathrm{degl}(h)$ for all nonzero $g, h$, and $\mathrm{degl}(g + h) \le_{\mathrm{lex}} \max(\mathrm{degl}(g), \mathrm{degl}(h))$ whenever
both sides are defined.

9-26. (a) Find the Kostka matrix indexed by all partitions of 4. (b) Invert this matrix, and
thereby express the monomial symmetric polynomials $m_\mu(x_1,x_2,x_3,x_4)$ (for $\mu \in \mathrm{Par}(4)$) as
linear combinations of Schur polynomials.

9-27. Find the Kostka matrix indexed by partitions in $\mathrm{Par}_3(7)$. Invert this matrix.

9-28. Let $K$ be the Kostka matrix indexed by all partitions of 8. How many nonzero entries
does this matrix have?

9-29. Suppose $A$ is an $n \times n$ matrix with integer entries such that $\det(A) = \pm 1$. Prove that
$A^{-1}$ has all integer entries. In particular, this problem applies when $A$ is a Kostka matrix.
[Hint: See Corollary 12.54.]

9-30. Suppose $\{v_i : i \in I\}$ is a basis for a finite-dimensional real vector space $V$, $\{w_i : i \in I\}$
is an indexed family of vectors in $V$, and for some total ordering $\le$ of $I$ and some scalars
$a_{ij} \in \mathbb{R}$ with $a_{ii} \ne 0$, we have $w_i = \sum_{j \le i} a_{ij} v_j$ for all $i \in I$. Prove that $\{w_i : i \in I\}$ is a
basis of $V$.

9-31. Fix $N \ge k$, and define column vectors $\mathbf{s}$, $\mathbf{m}$, $\mathbf{e}$, $\mathbf{h}$ whose entries are $s_\mu$, $m_\mu$, $e_\mu$, and $h_\mu$
(respectively) with $\mu$ ranging over partitions of $k$. For each ordered pair of column vectors
$\mathbf{v}$, $\mathbf{w}$ in this list, find the matrix $A$ such that $\mathbf{v} = A\mathbf{w}$. Express each answer using the
Kostka matrix and its variations.

9-32. Let $T$ be the tableau in Example 9.47. Confirm that $T \leftarrow 1$ and $T \leftarrow 0$ are as stated
in that example. Also, compute $T \leftarrow i$ for $i = 2, 4, 5, 7$, and verify that Theorem 9.48 holds.


9-33. Let $T$ be the semistandard tableau shown here:


Compute $T \leftarrow i$ for $1 \le i \le 9$.


9-34. Give a non-recursive description of $T \leftarrow x$ in the case where: (a) $x$ is larger than
every entry of $T$; (b) $x$ is smaller than every entry of $T$.
9-35. Let $T$ be the tableau in Example 9.47. Perform reverse insertion starting at each
corner box of $T$ to obtain smaller tableaux $T_i$ and values $x_i$. Verify that $T_i \leftarrow x_i = T$ for
each answer.

9-36. Let $T$ be the tableau in Exercise 9-33. Perform reverse insertion starting at each
corner box of $T$, and verify that Theorem 9.52(a) and (b) hold in each case.

9-37. Prove Theorem 9.52(c).

9-38. Prove Theorem 9.53.

9-39. Express $s_{(4,4,3,1,1)} h_1$ as a sum of Schur polynomials.

9-40. Let $T$ be the tableau in Example 9.47. Successively insert 1, 2, 2, 3, 5, 5 into $T$, and
verify that the assertions of the Bumping Comparison Theorem hold.

9-41. Let $T$ be the tableau in Example 9.47. Successively insert 7, 5, 3, 2, 1 into $T$, and verify
that the assertions of the Bumping Comparison Theorem hold.

9-42. Let $T$ be the tableau in Exercise 9-33. Successively insert 1, 1, 3, 3, 3, 4 into $T$, and
verify that the assertions of the Bumping Comparison Theorem hold.

9-43. Let $T$ be the tableau in Exercise 9-33. Successively insert 7, 6, 5, 3, 2, 1 into $T$, and
verify that the assertions of the Bumping Comparison Theorem hold.

9-44. Let $T$ be the tableau in Example 9.55 of shape $\mu = (5,4,4,4,1)$. For each shape $\nu$
such that $\nu/\mu$ is a horizontal strip of size 3, find a weakly increasing sequence $x_1 \le x_2 \le x_3$
such that $(((T \leftarrow x_1) \leftarrow x_2) \leftarrow x_3)$ has shape $\nu$, or prove that no such sequence exists.

9-45. Repeat the previous exercise, replacing horizontal strips by vertical strips and weakly
increasing sequences by strictly decreasing sequences.

9-46. Prove Theorem 9.59(b).

9-47. Let $T$ be the tableau in Exercise 9-33. Find the unique semistandard tableau $S$ of
shape $(7,5,4,4,1,1)$ and $z_1 \le z_2 \le z_3 \le z_4$ such that $T = S \leftarrow z_1 z_2 z_3 z_4$.

9-48. Let $T$ be the tableau in Exercise 9-33. Find the unique semistandard tableau $S$ of
shape $(6,6,5,3,2)$ and $z_1 > z_2 > z_3 > z_4$ such that $T = S \leftarrow z_1 z_2 z_3 z_4$.

9-49. Expand each symmetric polynomial into sums of Schur polynomials: (a) $s_{(4,3,1)} e_2$;
(b) $s_{(2,2)} h_3$; (c) $s_{(2,2,1,1,1)} h_4$; (d) $s_{(3,3,2)} e_3$.

9-50. Use the Pieri Rule to find the Schur expansions of $h_{(3,2,1)}$, $h_{(3,1,2)}$, $h_{(1,2,3)}$, and $h_{(1,3,2)}$,
and verify that the answers agree with those found in Example 9.62.

9-51. Expand each symmetric polynomial into sums of Schur polynomials: (a) $h_{(2,2,2)}$;
(b) $h_{(5,3)}$; (c) $s_{(3,2)} h_{(2,1)}$; (d) $s_{(6,3,2,2)/(3,2)}$.

9-52. Find the coefficients of the following Schur polynomials in the Schur expansion of
$h_{(3,2,2,1,1)}$: (a) $s_{(9)}$; (b) $s_{(5,4)}$; (c) $s_{(4,4,1)}$; (d) $s_{(2,2,2,2,1)}$; (e) $s_{(3,3,3)}$; (f) $s_{(3,2,2,1,1)}$.

9-53. Use Remark 9.67 to compute the monomial expansions of $h_\mu(x_1,x_2,x_3,x_4)$ for all
partitions $\mu$ of size at most four.

9-54. Let $\alpha = (\alpha_1,\ldots,\alpha_s)$. Prove that the coefficient of $m_\lambda(x_1,\ldots,x_N)$ in the monomial
expansion of $h_\alpha(x_1,\ldots,x_N)$ is the number of $s \times N$ matrices $A$ with entries in $\mathbb{Z}_{\ge 0}$ such
that $\sum_{j=1}^{N} A(i,j) = \alpha_i$ for $1 \le i \le s$ and $\sum_{i=1}^{s} A(i,j) = \lambda_j$ for $1 \le j \le N$.

9-55. Find and prove a combinatorial interpretation (similar to the one in the previous
exercise) for the coefficient of $m_\lambda(x_1,\ldots,x_N)$ in the monomial expansion of $e_\alpha(x_1,\ldots,x_N)$.

9-56. Use the Pieri Rules to compute the Schur expansions of: (a) $e_{(3,3,1)}$; (b) $e_{(5,3)}$;
(c) $s_{(3,2)} e_{(2,1)}$; (d) $s_{(4,3,3,3,1,1)/(3,1,1,1)}$.

9-57. Find the coefficients of the following Schur polynomials in the Schur expansion of
$e_{(4,3,2,1)}$: (a) $s_{(4,3,2,1)}$; (b) $s_{(5,5)}$; (c) $s_{(2,2,2,2,2)}$; (d) $s_{(2,2,2,1^4)}$; (e) $s_{(1^{10})}$.
9-58. Use Remark 9.72 to express $e_{(2,2,1)}$ and $e_{(3,2)}$ as linear combinations of monomial
symmetric polynomials.

9-59. Prove: for all N > k, pp(a1,...,un) = 1 seca ie, 12+, 0N). 

9-60. Express $s_{(2,2)} p_3$ as a sum of Schur polynomials.

9-61. Conjecture a formula for expressing spy as a linear combination of Schur polynomials. 
Can you prove your formula? 

9-62. Prove that the following lists of polynomials are algebraically dependent by exhibiting
a non-trivial linear combination of monomials equal to zero: (a) $h_i(x_1,x_2)$ for $1 \le i \le 3$;
(b) $e_i(x_1,x_2,x_3)$ for $1 \le i \le 4$; (c) $p_i(x_1,x_2,x_3)$ for $1 \le i \le 4$.

9-63. Prove that any sublist of an algebraically independent list is algebraically independent. 
9-64. Suppose $\alpha = (\alpha_1,\ldots,\alpha_N) \in \mathbb{Z}_{\ge 0}^N$ is a partition. Show that

\[
\mathrm{degl}\big(e_1^{\alpha_1 - \alpha_2} e_2^{\alpha_2 - \alpha_3} \cdots e_{N-1}^{\alpha_{N-1} - \alpha_N} e_N^{\alpha_N}\big) = \alpha
\]

(see Exercise 9-25 for the definition of lex degree).


9-65. Algorithmic Proof of the Fundamental Theorem of Symmetric Polynomials.
Prove that the following algorithm will express any $f \in \Lambda_N$ as a polynomial in
the elementary symmetric polynomials $e_i(x_1,\ldots,x_N)$ (where $1 \le i \le N$) in finitely many
steps. If $f = 0$, use the zero polynomial. Otherwise, let the term of largest lex degree in $f$
be $c x^\alpha$ where $c \in \mathbb{R}$ is nonzero. Use symmetry of $f$ to show that $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_N$, and
that $f - c\, e_1^{\alpha_1 - \alpha_2} e_2^{\alpha_2 - \alpha_3} \cdots e_{N-1}^{\alpha_{N-1} - \alpha_N} e_N^{\alpha_N}$ is either 0 or has lex degree $\beta <_{\mathrm{lex}} \alpha$. Continue
similarly to express this new polynomial (and hence $f$) as a polynomial in the $e_i$.

9-66. Use the algorithm in the preceding exercise to express $m_{(2,1)}(x_1,x_2,x_3,x_4)$ and
$p_3(x_1,x_2,x_3,x_4)$ as polynomials in $\{e_i(x_1,x_2,x_3,x_4) : 1 \le i \le 4\}$.

9-67. Use the test in Theorem 9.76 to verify that the polynomials $h_i(x_1,x_2,x_3)$ for $1 \le i \le 3$
are algebraically independent. Can you generalize this computation to more than three
variables?


9-68. Use the test in Theorem 9.76 to verify that the polynomials $e_i(x_1,x_2,x_3,x_4)$ for
$1 \le i \le 4$ are algebraically independent. Can you generalize this computation to more than
four variables?


9-69. Compute the images of 


and | 4, 


under the involution J in the proof of Theorem 9.81. 

9-70. List all the matched pairs (z,I(z)) in the proof of Theorem 9.81 when: (a) N = 2 
and m = 3; (b) N =3 and m= 2. 

9-71. Imitate the proof of Theorem 9.82 to show that algebraic independence of $h_1,\ldots,h_N$
in $\mathbb{R}[x_1,\ldots,x_N]$ implies algebraic independence of $e_1,\ldots,e_N$.

9-72. (a) Prove the recursion $e_k(x_1,\ldots,x_N) = e_k(x_1,\ldots,x_{N-1}) + e_{k-1}(x_1,\ldots,x_{N-1})\, x_N$
for $k, N \ge 1$. What are the initial conditions? (b) Find a similar recursion for $h_k(x_1,\ldots,x_N)$.


9-73. (a) Prove $s'(n,k) = e_{n-k}(1,2,\ldots,n-1)$. (b) Prove $S(n,k) = h_{n-k}(1,2,\ldots,k)$.

9-74. Prove Theorem 9.86 by expanding $\prod_{i=1}^{N} (x - r_i)$ using the Generalized Distributive
Law.
9-75. Consider the polynomial $p = x^5 - 2x^4 + 5x^3 + 7x^2 - x - 4$, which has five
roots $r_1,\ldots,r_5 \in \mathbb{C}$. Compute: (a) the sum of the roots; (b) the product of the roots;


9-76. Use Theorem 9.87 to calculate the coefficient of x* in the multiplicative inverse of 
(1 — 2a)(1 — 3x)(1 — 52). 

9-77. Let $A$ be an $n \times n$ complex matrix. What is the relationship between the coefficients
of the characteristic polynomial $\det(tI - A)$ and the eigenvalues $r_1,\ldots,r_n$ of $A$?

9-78. Use (9.10) to show that $p_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically independent
iff $h_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically independent.

9-79. Use (9.11) to show that $p_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically independent
iff $e_i(x_1,\ldots,x_N)$ (for $1 \le i \le N$) are algebraically independent.


9-80. Consider the maps f and g from the proof of Theorem 9.88. Compute 
f(5, [2]4] 41515}, [4]4]) and g([1fi"]1]212[4]6]6)). 


9-81. Consider the maps J and g from the proof of Theorem 9.89. Compute 


I ( Ey ? I 3, Ey ? I 3, 13 |, ? I 3, oh ° 


For any objects that are fixed points of J, compute the images of those objects under g. 


9-82. Write y 4 7 in terms of symmetric polynomials. 
9-83. Obtain Theorems 9.88 and 9.89 algebraically by differentiating the generating
functions $H_N(t)$ and $E_N(-t)$.

9-84. Use the recursions in Theorems 9.88 and 9.89 to verify the formulas for hy, 3!e3, and 
Ale, stated in Example 9.90. 


9-85. Complete the proof of Theorem 9.91 by checking that $g(f(y_0)) = y_0$ and $f(g(z_0)) = z_0$,
and, in general, $g \circ f = \mathrm{id}_Y$ and $f \circ g = \mathrm{id}_X$.


9-86. Let g be the map in the proof of Theorem 9.91. Compute g(z1) and g(z2), where 
—_|w: 3 72 5 46 8 14. —_{|w: 2 1 4 3 7 5 6 8 
a |e: 2 ew Ad eh Gl Oe te eS Se? 

9-87. Let f be the map in the proof of Theorem 9.91. Compute f(y), where 

123 4 5 6 7 8 
y= (2,2,2,2),2,58,8467,| 5 9 § 2229 3|)° 


9-88. Let I be the involution in the proof of Theorem 9.93. Compute I(z1), I(z2), and 
I(f(y)), where z1, 22, and y are the objects given in the preceding two exercises. 


9-89. Let $A$ be an $n \times n$ complex matrix with eigenvalues $r_1,\ldots,r_n$. (a) Show that the
trace of $A$, defined by $\mathrm{tr}(A) = \sum_{i=1}^{n} A(i,i)$, is $p_1(r_1,\ldots,r_n)$. (b) For $k \ge 1$, express
$\mathrm{tr}(A^k)$ as a function of $r_1,\ldots,r_n$. (c) Suppose $n = 5$ and $(\mathrm{tr}(A^k) : k = 1,2,\ldots,5) =
(3, 41, -93, 693, -2957)$. Find the characteristic polynomial of $A$.

9-90. Compute: (a) w(h3); (b) w(p(3,2,1,1)); (€) w(ea,ay); (d) w(5(5,3,3,1,1,1))- 

9-91. Use facts about w to deduce Theorem 9.69 from Theorem 9.64. 

9-92. Use facts about w to deduce Theorem 9.89 from Theorem 9.88. 


9-93. Use w to deduce Theorem 9.93 from Theorem 9.91. 
9-94. Show that there exists an algebra isomorphism of $\Lambda_N$ sending each $p_i$ to $-p_i$. Compute
the images of $h_n$ and $e_n$ under this isomorphism.


9-95. In the proof of Theorem 9.95(b), where is the assumption n < N needed? 
9-96. Compute the polynomials $\mathrm{fgt}_\mu(x_1,x_2,x_3)$ for all partitions $\mu$ of size at most 3.
9-97. Compute RSK(w) for all w € Ss. 

9-98. Compute $\mathrm{RSK}^{-1}(P,Q)$ for all pairs $P, Q$ of standard tableaux of shape $(2,2)$.


9-99. Let $w = 41572863 \in S_8$. Compute $\mathrm{RSK}(w)$ and $\mathrm{RSK}(w^{-1})$. Verify that Theorem 9.106
holds in this case.


9-100. Given the pair of standard tableaux:

      1 3 6          1 2 5
  P = 2 4 8 ,    Q = 3 6 7
      5 7            4 8

Compute $w = \mathrm{RSK}^{-1}(P,Q)$ and $v = \mathrm{RSK}^{-1}(Q,P)$, and verify that Theorem 9.106 holds
in this case.


9-101. (a) Verify that Theorem 9.103 holds for the example w = 35164872 by comparing 
the first rows of the tableaux in Figure 9.1 with the shadow diagram in Figure 9.4. (b) Verify 
the assertions in Theorem 9.105 using Figure 9.5. 


9-102. Draw all the shadow diagrams for the permutations $w$ and $w^{-1}$ in Exercise 9-99,
and use them to verify the assertions in Theorem 9.105 for this example.


9-103. Draw all the shadow diagrams for the permutations w and v in Exercise 9-100, and 
use them to verify the assertions in Theorem 9.105 for this example. 


9-104. (a) Prove: for all $n \ge 1$, $n! = \sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|^2$. (b) Verify this identity directly
for $n = 5$.

9-105. Show that the number of $w \in S_n$ such that $w^2 = \mathrm{id}$ is given by $\sum_{\lambda \in \mathrm{Par}(n)} |\mathrm{SYT}(\lambda)|$.

9-106. Suppose $w' \in S_{n-1}$ is obtained from $w \in S_n$ by deleting $n$ from the one-line form
of $w$. How is $P(w')$ related to $P(w)$?

9-107. Compute RSK(w) for all words w € {1,2}%. 

9-108. Compute RSK(313211231), and verify that Theorem 9.111 holds in this case. 
9-109. Compute the word w such that 


[1}i}2]2] [1/2] 4/6) 
RSK(w) = | [2[3[4/4] [3/5/81 | - 
4]5]5} | Lz] 9 ful 


Verify that Theorem 9.111 holds in this case. 
9-110. (a) Compute VP resy-r((4,1)) gz), (b) Compute Ve resyT((3,2,1)) gna), 
9-111. The reading word of a semistandard tableau $T$ is the word $\mathrm{rw}(T)$ obtained by listing
the values in each row of $T$ from left to right, working from the bottom row to the top row.
Prove that $P(\mathrm{rw}(T)) = T$. What is $Q(\mathrm{rw}(T))$?

9-112. Given $w = w_1 \cdots w_n \in S_n$, let $\mathrm{rev}(w) = w_n \cdots w_1$ be the reversal of $w$. By looking
at examples, conjecture a relationship between $P(\mathrm{rev}(w))$ and $P(w)$. Can you prove your
conjecture? Does the conjecture extend to arbitrary words $w$?


9-113. Express pc;4) as a linear combination of Schur polynomials. 
9-114. (a) Compute the biword and the pair of tableaux associated to the matrix 
1 


. (b) Do the same for the transpose of this matrix. 
9-115. (a) Compute the matrix and pair of tableaux associated to the biword


1122 2 3 3 5 
24113 3 3 24° 


(b) Do the same for the biword obtained by switching the two rows and sorting the new 
top row into increasing order (using the values in the bottom row to break ties). 


9-116. (a) Compute the biword and matrix associated to the pair of tableaux: 


(b) Do the same for the pair of tableaux (Q, P). 


9-117. Show that if a matrix $A$ maps to $(P, Q)$ under the RSK correspondence, then the
transpose matrix $A'$ maps to $(Q, P)$ under RSK. Do this by generalizing the shadow
constructions in §9.21, allowing more than one dot to occupy a given point $(i, j)$ in the graph.


9-118. Verify the fact (used in the proof of Theorem 9.124) that the polynomials 
{Pa(x)pa(y)/z : (a, 8) € Par(k) x Par(k)} 

are linearly independent. 

9-119. Suppose A and B are n x n matrices such that AB = I. Prove that BA = I. 

9-120. Suppose { f,, : 4 € Par(k)} is an orthonormal basis of A‘,, and g € AX. Prove that 

C= y weP ane) (9) fu) fu 

9-121. Prove: i en Neer! + aiy;) = ee €p(®1,---,0u)My(Y1,---, Yn). 

9-122. Prove: []j2; ji + rey) = Dyepar (uPu(@1s ++») 2M )Pu(yas--- Yn)/Zp- 

9-123. Prove: TL Ti +2iy;) = VucPar $u(21,---,0M) Sp’ (Yi,---,yn)- (Hint: Use w.) 


9-124. Given a skew shape $S \subseteq \mathbb{Z}_{>0}^2$, describe how to calculate the number of different
pairs of partitions $(\mu, \nu)$ such that $S = \mu/\nu$.

9-125. Find necessary and sufficient algebraic conditions on the parts of $\mu$ and $\nu$ to ensure
that the skew shape $\mu/\nu$ is: (a) a horizontal strip; (b) a vertical strip.

9-126. How many horizontal strips are contained in $\{1,2,\ldots,a\} \times \{1,2,\ldots,b\}$?

9-127. Prove that $|\mathrm{SYT}(\mu/\nu)| = |\mathrm{SYT}(\mu'/\nu')|$ for all skew shapes $\mu/\nu$.

9-128. Compute $s_{(3,2)/(1)}(x_1,\ldots,x_N)$ for $N = 2, 3, 4$ by enumerating tableaux.

9-129. Let N > 4. Enumerate tableaux to confirm that the coefficients of wi tore ta, 
x 030304, and 21 73x3x4 in 8(4,3)/(1) (£1, ...,@y) are all equal to 6. What happens to these 
coefficients if N < 4? 

9-130. For which values of N is 8,/)(@1,..-,%n) = 0? 

9-131. Find a skew shape ju/y such that eghahzeshi = 8,/p- 

9-132. Prove that any finite product of skew Schur polynomials is a skew Schur polynomial. 


9-133. Express the skew Schur polynomial 8(3,3,2)/(1)(%1,---,@g) as a linear combination of 
monomial symmetric polynomials. 


9-134. For all partitions $\mu \in \mathrm{Par}(3)$, express $h_\mu$ and $e_\mu$ in terms of monomial symmetric
polynomials by viewing $h_\mu$ and $e_\mu$ as instances of skew Schur polynomials.


9-135. Suppose we apply the Tableau Insertion Algorithm 9.46 to a tableau T of skew 
shape. Are Theorems 9.48 and 9.49 still true? 


9-136. Prove Theorem 9.135. 
9-137. Prove Theorem 9.136. 
Notes


Ian Macdonald’s book [84] contains a comprehensive treatment of symmetric polynomials, 
with a heavy emphasis on algebraic methods. A more combinatorial development is given by 
Stanley in Chapter 7 of [121]; see the references to that chapter for an extensive bibliography 
of the literature in this area. Two other relevant references are [41], which treats tableaux 
and their connections to representation theory and geometry, and [117], which explains the 
role of symmetric polynomials in the representation theory of symmetric groups. 

The bijective proof of Theorem 9.27 is due to Bender and Knuth [8]. The algorithmic 
proof of the existence part of the fundamental theorem of symmetric polynomials (out- 
lined in Exercise 9-65) is usually attributed to Waring [124]. Some of the seminal papers by 
Robinson, Schensted, and Knuth on the RSK correspondence are [74, 111, 118]. The symmetry
property in Theorem 9.106 was first proved by Schützenberger [120]; the combinatorial
proof using shadow lines is due to Viennot [129].
Abaci and Antisymmetric Polynomials


In Chapter 9, we used combinatorial operations on tableaux to establish algebraic properties
of Schur polynomials and other symmetric polynomials. This chapter investigates the
interplay between the combinatorics of abaci and the algebraic properties of antisymmetric
polynomials. These concepts are used to prove additional facts about integer partitions
and symmetric polynomials. In particular, we derive some formulas for expanding skew
Schur polynomials in terms of various bases. Key results include the Jacobi–Trudi Formulas
and the Littlewood–Richardson Rule for the Schur expansion of the product of two Schur
polynomials.




10.1 Abaci and Integer Partitions 


An abacus is an instrument used in ancient times for performing arithmetical calculations. 
The abacus consists of one or more runners that contain sliding beads. The following com- 
binatorial object gives a mathematical model of an abacus. 


10.1. Definition: One-Runner Abacus. An abacus with one runner is a function
$w : \mathbb{Z} \to \{0,1\}$ such that for some $m$ and $n$, $w_i = 1$ for all $i \leq m$ and $w_i = 0$ for all $i \geq n$. We
think of $w$ as an infinite word $\cdots w_{-2}w_{-1}w_0w_1w_2w_3\cdots$ that begins with an infinite string
of 1's and ends with an infinite string of 0's. Each 1 is called a bead, and each 0 is called a
gap. Let Abc denote the set of all 1-runner abaci. An abacus $w$ is called justified at position
$m$ iff $w_i = 1$ for all $i \leq m$ and $w_i = 0$ for all $i > m$. Intuitively, an abacus is justified iff
all the beads have been pushed to the left as far as they will go. The weight of an abacus
$w$, denoted $\mathrm{wt}(w)$, is the number of pairs $i < j$ with $w_i < w_j$ (or equivalently, $w_i = 0$ and
$w_j = 1$).


10.2. Example. Here is a picture of a 1-runner abacus:

[Picture omitted: beads and gaps on a single runner.]

This picture corresponds to the mathematical abacus

$w = \cdots 111101100\underline{1}10101000 \cdots$,

where the underlined 1 is $w_0$. All positions to the left of the displayed region contain beads,
and all positions to the right contain gaps.

Consider the actions required to transform $w$ into a justified abacus. We begin with the
bead following the leftmost gap, which slides one position to the left, producing

$w' = \cdots 111110100110101000 \cdots$.

The next bead now slides into the position vacated by the previous bead, producing

$w'' = \cdots 111111000110101000 \cdots$.

The next bead moves three positions to the left to give the abacus

$w^{(3)} = \cdots 111111100010101000 \cdots$.

In the next three steps, the remaining beads move left by three, four, and five positions
respectively, leading to the abacus

$w^* = \cdots 111111111100000000 \cdots$,

which is justified at position 0. If we list the number of positions that each bead moved,
we obtain a weakly increasing sequence: $1 \leq 1 \leq 3 \leq 3 \leq 4 \leq 5$. This sequence can be
identified with the integer partition $\lambda = (5,4,3,3,1,1)$. Observe that $\mathrm{wt}(w) = 17 = |\lambda|$.
This example generalizes as follows.


10.3. Theorem: Partitions and Abaci. Justification of abaci defines a bijection
$J : \mathrm{Abc} \to \mathbb{Z} \times \mathrm{Par}$ with inverse $U : \mathbb{Z} \times \mathrm{Par} \to \mathrm{Abc}$. If $J(w) = (m, \lambda)$, then $\mathrm{wt}(w) = |\lambda|$.


Proof. Given an abacus $w$, let $n$ be the least integer with $w_n = 0$ (the position of the
leftmost gap), which exists since $w$ begins with an infinite string of 1's. Since $w$ ends with
an infinite string of 0's, there are only finitely many $j > n$ with $w_j = 1$; let these indices
be $j_1 < j_2 < \cdots < j_t$, where $n < j_1$. We justify the abacus by moving the bead at position
$j_1$ left $\lambda_t = j_1 - n$ places. Then we move the bead at position $j_2$ left $\lambda_{t-1} = j_2 - (n+1)$
places. (We subtract $n+1$ since the leftmost gap is now at position $n+1$.) In general,
at stage $k$ we move the bead at position $j_k$ left $\lambda_{t+1-k} = j_k - (n+k-1)$ places. After
moving all $t$ beads, we have a justified abacus with the leftmost gap located at position
$n+t$. Since $n < j_1 < j_2 < \cdots < j_t$, it follows that $0 < \lambda_t \leq \lambda_{t-1} \leq \cdots \leq \lambda_1$. We define
$J(w) = (n+t-1, \lambda)$ where $\lambda = (\lambda_1,\ldots,\lambda_t)$. For all $k$, moving the bead at position $j_k$ left
$\lambda_{t+1-k}$ places decreases the weight of the abacus by $\lambda_{t+1-k}$, because each one-step move
replaces an adjacent gap-bead pair 01 by 10 and so removes exactly one pair counted by the
weight. Since a justified abacus has weight zero, it follows that the weight of the original
abacus is precisely $\lambda_t + \cdots + \lambda_1 = |\lambda|$.

$J$ is a bijection because unjustification is a two-sided inverse for $J$. More precisely, given
$(m, \mu) \in \mathbb{Z} \times \mathrm{Par}$, we create an abacus $U(m, \mu)$ as follows. Start with an abacus justified at
position $m$. Move the rightmost bead to the right $\mu_1$ places, then move the next bead to
the right $\mu_2$ places, and so on. This process reverses the action of $J$. □
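
The maps $J$ and $U$ from this proof can be sketched in Python. The encoding below is an assumption made for illustration and does not come from the text: an abacus is stored as a pair `(base, extra)`, meaning that every position at most `base` holds a bead, `extra` is the finite set of bead positions greater than `base`, and every other position is a gap. The function names are likewise hypothetical, and partitions are assumed to have positive parts only.

```python
def justify(base, extra):
    """Compute J(w) = (m, lam) for the abacus encoded by (base, extra)."""
    n = base + 1
    while n in extra:                 # skip leading beads to find the leftmost gap n
        n += 1
    beads = sorted(j for j in extra if j > n)      # j_1 < j_2 < ... < j_t
    # At stage k (1-indexed), the bead at j_k moves left j_k - (n + k - 1) places.
    moves = [j - (n + idx) for idx, j in enumerate(beads)]
    return n + len(beads) - 1, tuple(reversed(moves))


def unjustify(m, mu):
    """Compute U(m, mu) in the same (base, extra) encoding."""
    t = len(mu)
    # The bead starting at position m - i moves right mu[i] places (i = 0, 1, ...).
    return m - t, {m - i + part for i, part in enumerate(mu)}


def weight(base, extra):
    """Count the pairs i < j with a gap at i and a bead at j."""
    beads = sorted(extra)
    # Gaps left of the bead at j: positions strictly between base and j, minus earlier beads.
    return sum((j - base - 1) - idx for idx, j in enumerate(beads))


# The abacus of Example 10.2: leftmost gap at -5, later beads at -4, -3, 0, 1, 3, 5.
w = (-6, {-4, -3, 0, 1, 3, 5})
print(justify(*w))   # (0, (5, 4, 3, 3, 1, 1))
print(weight(*w))    # 17
```

Storing only the finitely many beads to the right of a fully beaded prefix keeps the infinite word finite without changing the arithmetic of the proof.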


10.4. Remark: Computing U. The unjustification map $U$ can also be computed using
partition diagrams. We can reconstruct the bead-gap sequence in the abacus $U(m, \mu)$ by
traversing the frontier of the diagram of $\mu$ (traveling northeast) and recording a gap (0) for
each horizontal step and a bead (1) for each vertical step. For example, if $\mu = (5,4,3,3,1,1)$,
the diagram of $\mu$ is

□□□□□
□□□□
□□□
□□□
□
□

and the bead-gap sequence is 01100110101. To obtain the abacus $w$, we prepend an infinite
string of 1's, append an infinite string of 0's, and finally use $m$ to determine which symbol
in the resulting string is considered to be $w_0$. It can be checked that this procedure produces
the same abacus as the map $U$ in the previous proof. We can also confirm that the map $U$
is weight-preserving via the following bijection between the set of cells of the diagram of $\mu$
and the set of pairs $i < j$ with $w_i = 0$ and $w_j = 1$. Starting at a cell $c$, travel south to reach
a horizontal edge on the frontier (encoded by some $w_i = 0$). Travel east from $c$ to reach a
vertical edge on the frontier (encoded by some $w_j = 1$ with $j > i$). For example, the cell in
the second row and third column of the diagram above corresponds to the marked gap-bead
pair in the associated abacus:

$\cdots 0110\underline{0}110\underline{1}01 \cdots$.
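
The frontier traversal in this remark can be sketched as follows. The function name `frontier_word` is a hypothetical choice, and a partition is assumed to be given as a tuple of positive parts in weakly decreasing order.

```python
def frontier_word(mu):
    """Bead-gap word of the frontier of mu, read from southwest to northeast."""
    word = []
    previous_length = 0
    for row_length in reversed(mu):                        # bottom row first
        word.append("0" * (row_length - previous_length))  # horizontal steps (gaps)
        word.append("1")                                   # one vertical step (bead) per row
        previous_length = row_length
    return "".join(word)


print(frontier_word((5, 4, 3, 3, 1, 1)))   # 01100110101
```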


10.2 The Jacobi Triple Product Identity


The Jacobi Triple Product Identity is a partition identity that has several applications in 
combinatorics and number theory. Here we give a bijective proof (due to Borcherds) of this 
formal power series identity by using cleverly chosen weights on abaci. 


10.5. Theorem: Jacobi Triple Product Identity. 


\[ \sum_{m \in \mathbb{Z}} q^{m(m+1)/2} u^m = \prod_{n=1}^{\infty} (1 + uq^n) \prod_{n=0}^{\infty} (1 + u^{-1}q^n) \prod_{n=1}^{\infty} (1 - q^n). \]


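Proof. Since the formal power series $\prod_{n=1}^{\infty} (1 - q^n)$ is invertible, it suffices to prove the
equivalent identity

\[ \sum_{m \in \mathbb{Z}} q^{m(m+1)/2} u^m \prod_{n=1}^{\infty} \frac{1}{1 - q^n} = \prod_{n=1}^{\infty} (1 + uq^n) \prod_{n=0}^{\infty} (1 + u^{-1}q^n). \tag{10.1} \]

Let the weight of an integer $m$ be $\mathrm{wt}(m) = q^{m(m+1)/2} u^m$, and let the weight of a partition
$\mu$ be $q^{|\mu|}$. Since $\prod_{n=1}^{\infty} 1/(1 - q^n) = \sum_{\mu \in \mathrm{Par}} q^{|\mu|}$ by Theorem 5.45, the left side of (10.1) is

\[ \sum_{(m,\mu) \in \mathbb{Z} \times \mathrm{Par}} \mathrm{wt}(m)\,\mathrm{wt}(\mu), \]

which is the generating function for the weighted set $\mathbb{Z} \times \mathrm{Par}$.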

On the other hand, let us define new weights on the set Abc as follows. Given an abacus
$w$, let $N(w) = \{i \leq 0 : w_i = 0\}$ be the set of nonpositive positions in $w$ not containing a
bead, and let $P(w) = \{i > 0 : w_i = 1\}$ be the set of positive positions in $w$ containing a
bead. Both $N(w)$ and $P(w)$ are finite sets. Define

\[ \mathrm{wt}(w) = \prod_{i \in N(w)} u^{-1} q^{|i|} \prod_{i \in P(w)} u q^{i}. \]

We can build an abacus by choosing a bead or a gap in each nonpositive position (choosing
a bead all but finitely many times), and then choosing a bead or a gap in each positive
position (choosing a gap all but finitely many times). The generating function for the choice
at position $i \leq 0$ is $1 + u^{-1}q^{|i|}$, while the generating function for the choice at position $i > 0$
is $1 + uq^{i}$. By the Product Rule for Weighted Sets, the right side of (10.1) is $\sum_{w \in \mathrm{Abc}} \mathrm{wt}(w)$.

To complete the proof, it suffices to verify that the justification bijection
$J : \mathrm{Abc} \to \mathbb{Z} \times \mathrm{Par}$ is weight-preserving. Suppose $J(w) = (m, \mu)$ for some abacus $w$. The map $J$
converts $w$ to an abacus $w^*$, justified at position $m$, by $|\mu|$ steps in which some bead
moves one position to the left. Claim 1: The weight of the justified abacus $w^*$ is
$\mathrm{wt}(m) = u^m q^{m(m+1)/2}$. We prove this by considering three cases. When $m = 0$, $N(w^*) = \emptyset = P(w^*)$,
so $\mathrm{wt}(w^*) = 1 = \mathrm{wt}(0)$. When $m > 0$, $N(w^*) = \emptyset$ and $P(w^*) = \{1, 2, \ldots, m\}$, so

\[ \mathrm{wt}(w^*) = u^m q^{1+2+\cdots+m} = u^m q^{m(m+1)/2} = \mathrm{wt}(m). \]

When $m < 0$, $N(w^*) = \{0, -1, -2, \ldots, -(|m| - 1)\}$ and $P(w^*) = \emptyset$, so

\[ \mathrm{wt}(w^*) = u^{-|m|} q^{0+1+2+\cdots+(|m|-1)} = u^m q^{|m|(|m|-1)/2} = u^m q^{m(m+1)/2} = \mathrm{wt}(m). \]

Claim 2: If we move one bead one step left in a given abacus $y$, the $u$-weight stays
the same and the $q$-weight drops by 1. Let $i$ be the initial position of the moved bead,
and let $y'$ be the abacus obtained by moving the bead to position $i - 1$. If $i > 1$, then
$N(y') = N(y)$ and $P(y') = (P(y) - \{i\}) \cup \{i - 1\}$, so $\mathrm{wt}(y') = \mathrm{wt}(y)/q$ as needed. If $i \leq 0$,
then $P(y') = P(y)$ and $N(y') = (N(y) - \{i - 1\}) \cup \{i\}$ (since the $N$-set records positions
of gaps), and so $\mathrm{wt}(y') = \mathrm{wt}(y) q^{|i|}/q^{|i-1|} = \mathrm{wt}(y)/q$. If $i = 1$, then $P(y') = P(y) - \{1\}$ and
$N(y') = N(y) - \{0\}$, so the total $u$-weight is preserved and the $q$-weight still drops by 1.
To finish, use Claim 2 $|\mu|$ times to conclude that

\[ \mathrm{wt}(w) = \mathrm{wt}(w^*) q^{|\mu|} = \mathrm{wt}(m)\,\mathrm{wt}(\mu) = \mathrm{wt}(J(w)). \qquad \square \]
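
As a sanity check of the weight-preservation step, the following sketch computes the exponents of $u$ and $q$ in $\mathrm{wt}(w)$ for $w = U(m, \mu)$ and compares them with $\mathrm{wt}(m)\,\mathrm{wt}(\mu) = u^m q^{m(m+1)/2 + |\mu|}$. The `(base, extra)` encoding (beads at every position at most `base` and at the positions in `extra`) and the function names are illustrative assumptions, not notation from the text.

```python
def unjustify(m, mu):
    """U(m, mu) as (base, extra): beads at all positions <= base and at `extra`."""
    return m - len(mu), {m - i + part for i, part in enumerate(mu)}


def uq_exponents(base, extra):
    """Return (a, b) such that wt(w) = u^a q^b for the abacus (base, extra)."""
    # P(w): positive positions that hold a bead.
    positive_beads = list(range(1, base + 1)) + [i for i in extra if i > 0]
    # N(w): nonpositive positions that hold a gap.
    nonpositive_gaps = [i for i in range(base + 1, 1) if i not in extra]
    a = len(positive_beads) - len(nonpositive_gaps)
    b = sum(positive_beads) + sum(-i for i in nonpositive_gaps)
    return a, b


# Check wt(U(m, mu)) = wt(m) wt(mu) on a few sample inputs.
for m in range(-4, 5):
    for mu in [(), (1,), (2, 1), (3, 3, 1), (5, 4, 3, 3, 1, 1)]:
        assert uq_exponents(*unjustify(m, mu)) == (m, m * (m + 1) // 2 + sum(mu))
```

A finite check like this does not replace the proof; it only confirms the bookkeeping in Claims 1 and 2 on sample inputs.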


Variations of the preceding proof can be used to establish other partition identities. As 
an example, we now sketch a bijective proof of Euler’s Pentagonal Number Theorem. Unlike 
our earlier proof in §5.17, the current proof does not use an involution to cancel oppositely 
signed objects. We remark that Euler’s identity also follows by appropriately specializing 
the Jacobi Triple Product Identity. 


10.6. Euler’s Pentagonal Number Theorem. 


\[ \prod_{n=1}^{\infty} (1 - q^n) = \sum_{k \in \mathbb{Z}} (-1)^k q^{(3k^2-k)/2}. \]

Proof. Note first that

\[ \prod_{n=1}^{\infty} (1 - q^n) = \prod_{i=1}^{\infty} (1 - q^{3i}) \prod_{i=1}^{\infty} (1 - q^{3i-1}) \prod_{i=1}^{\infty} (1 - q^{3i-2}). \]

It therefore suffices to prove the identity

\[ \prod_{i=1}^{\infty} (1 - q^{3i-1}) \prod_{i=1}^{\infty} (1 - q^{3i-2}) = \sum_{k \in \mathbb{Z}} (-1)^k q^{(3k^2-k)/2} \prod_{i=1}^{\infty} \frac{1}{1 - q^{3i}} = \sum_{(k,\mu) \in \mathbb{Z} \times \mathrm{Par}} (-1)^k q^{3|\mu| + (3k^2-k)/2}. \tag{10.2} \]

Consider abaci $w = \{w_{3k+1} : k \in \mathbb{Z}\}$ whose positions are indexed by integers congruent
to 1 mod 3. Define $N(w) = \{i \leq 0 : i \equiv 1 \pmod 3, w_i = 0\}$ and $P(w) = \{i > 0 : i \equiv 1 \pmod 3, w_i = 1\}$.
Let $\mathrm{sgn}(w) = (-1)^{|N(w)| + |P(w)|}$ and $\mathrm{wt}(w) = \sum_{i \in N(w) \cup P(w)} |i|$. We can
compute the generating function $\sum_{w} \mathrm{sgn}(w) q^{\mathrm{wt}(w)}$ in two ways. On one hand, placing a bead
or a gap in each negative position and each positive position leads to the double product
on the left side of (10.2). On the other hand, justifying the abacus transforms $w$ into a pair
$(3k - 2, \mu)$ for some $k \in \mathbb{Z}$. As in the proof of the Jacobi Triple Product Identity, one checks
that the justified abacus associated to a given integer $k$ has signed weight $(-1)^k q^{(3k^2-k)/2}$,
while each of the $|\mu|$ bead moves in the justification process reduces the $q$-weight by 3 and
preserves the sign. So the right side of (10.2) is also the generating function for these abaci,
completing the proof. □
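
The identity can be checked numerically on truncated power series. The sketch below expands the product up to a chosen degree and compares the coefficients with the pentagonal-number series; the function names and the truncation degree are illustrative choices.

```python
def euler_product_coeffs(limit):
    """Coefficients of prod_{n>=1} (1 - q^n), truncated at degree `limit`."""
    coeffs = [1] + [0] * limit
    for n in range(1, limit + 1):          # factors with n > limit do not matter
        # Multiply by (1 - q^n) in place; descending d uses the old coefficients.
        for d in range(limit, n - 1, -1):
            coeffs[d] -= coeffs[d - n]
    return coeffs


def pentagonal_coeffs(limit):
    """Coefficients of sum_{k in Z} (-1)^k q^{(3k^2 - k)/2}, truncated at `limit`."""
    coeffs = [0] * (limit + 1)
    for k in range(-limit, limit + 1):     # the exponent is at least |k|
        exponent = (3 * k * k - k) // 2
        if exponent <= limit:
            coeffs[exponent] += -1 if k % 2 else 1
    return coeffs


print(euler_product_coeffs(7))   # [1, -1, -1, 0, 0, 1, 0, 1]
```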
10.3 Ribbons and k-Cores


Recall the following fact about division of integers: given integers $a \geq 0$ and $k > 0$, there
exist a unique quotient $q$ and remainder $r$ satisfying $a = kq + r$ and $0 \leq r < k$. Our next
goal is to develop an analogous operation for dividing an integer partition $\mu$ by a positive
integer $k$. The result of this operation consists of $k$ quotient partitions together with a
remainder partition with special properties. We begin by describing the calculation of the
remainder, which is called a $k$-core. Abaci can then be used to establish the uniqueness of
the remainder, and this leads us to the definition of the $k$ quotient partitions.

To motivate our construction, consider the following pictorial method for performing
integer division. Suppose we are dividing $a = 17$ by $k = 5$, obtaining quotient $q = 3$ and
remainder $r = 2$. To find these answers geometrically, first draw a row of 17 boxes:

[Picture omitted: a row of 17 empty boxes.]

Now, starting at the right end, repeatedly remove strings of five consecutive cells until this
is no longer possible. We depict this process by placing an $i$ in every cell removed at stage
$i$, and writing a star in every leftover cell:

* * 3 3 3 3 3 2 2 2 2 2 1 1 1 1 1

The quotient $q$ is the number of 5-cell blocks we removed (here 3), and the remainder $r$ is
the number of leftover cells (here 2). This geometric procedure corresponds to the algebraic
process of subtracting $k$ from $a$ repeatedly until a remainder less than $k$ is reached. For the
purposes of partition division, we now introduce a two-dimensional version of this
strip-removal process.


10.7. Definition: Ribbons. A ribbon is a skew shape that can be formed by starting at a
given square, repeatedly moving left or down one step at a time, and including all squares
visited in this way. A ribbon consisting of $k$ cells is called a $k$-ribbon. A border ribbon of a
partition $\mu$ is a ribbon $R$ contained in $\mathrm{dg}(\mu)$ such that $\mathrm{dg}(\mu) - R$ is also a partition diagram.


10.8. Example. Here are two examples of ribbons: $(6,6,4,3)/(5,3,2)$ and $(7,4,4,4)/(3,3,3,2)$.

The first ribbon is a 9-ribbon and a border ribbon of $(6,6,4,3)$. The partition $(4,3,1)$ with
diagram

□□□□
□□□
□

has exactly eight border ribbons, four of which begin at the cell $(1,4)$.


10.9. Definition: k-cores. Let $k$ be a positive integer. An integer partition $\nu$ is called a
$k$-core iff no border ribbon of $\nu$ is a $k$-ribbon.


For example, $(4,3,1)$ is a 5-core, but not a $k$-core for any $k < 5$.

Suppose $\mu$ is any partition and $k$ is a positive integer. If $\mu$ has no border ribbons of size $k$,
then $\mu$ is a $k$-core. Otherwise, we can pick one such ribbon and remove it from the diagram
of $\mu$ to obtain a smaller partition diagram. We can iterate this process, repeatedly removing
a border $k$-ribbon from the current partition diagram until this is no longer possible. Since
the number of cells decreases at each step, the process must eventually terminate. The final
partition $\nu$ (which may be empty) must be a $k$-core. This partition is the remainder when
$\mu$ is divided by $k$.


10.10. Example. Consider the partition $\mu = (5,5,4,3)$ with diagram

□□□□□
□□□□□
□□□□
□□□

Let us divide $\mu$ by $k = 4$. We record the removal of border 4-ribbons by entering an $i$ in
each square that is removed at stage $i$. Any leftover squares at the end are marked by a
star. One possible removal sequence is the following:


Another possible sequence is:


Notice that the three 4-ribbons removed are different, but the final 4-core is the same,
namely $\nu = (4,1)$.


We can use abaci to show that the $k$-core obtained when dividing $\mu$ by $k$ depends only
on $\mu$ and $k$, not on the choice of which border $k$-ribbon is removed at each stage.


10.11. Definition: Abacus with k Runners. A $k$-runner abacus is an ordered $k$-tuple
of abaci. The set of all such objects is denoted $\mathrm{Abc}^k$.


10.12. Theorem: Decimation of Abaci. For each $k \geq 1$, there are mutually inverse
bijections $D_k : \mathrm{Abc} \to \mathrm{Abc}^k$ (decimation) and $I_k : \mathrm{Abc}^k \to \mathrm{Abc}$ (interleaving).


Proof. Given $w = (w_i : i \in \mathbb{Z}) \in \mathrm{Abc}$, set $D_k(w) = (w^0, w^1, \ldots, w^{k-1})$, where

$w^r = (w_{qk+r} : q \in \mathbb{Z})$ for all $r$ with $0 \leq r < k$.

Thus, the abacus $w^r$ is obtained by reading every $k$th symbol in the original abacus (in
both directions), starting at position $r$. It is routine to check that each $w^r$ is an abacus.
The inverse map interleaves these abaci to reconstruct the original 1-runner abacus. More
precisely, given $v = (v^0, v^1, \ldots, v^{k-1}) \in \mathrm{Abc}^k$, let $I_k(v) = z$ where $z_{qk+r} = v^r_q$ for all $q, r \in \mathbb{Z}$
with $0 \leq r < k$. One readily checks that $I_k(v)$ is an abacus and that $D_k$ and $I_k$ are two-sided
inverses. □
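
Decimation and interleaving can be sketched as follows, again under the illustrative assumption that an abacus is stored as `(base, extra)` with beads at every position at most `base` and at the finitely many positions in `extra`. The helper `normalize`, which picks a canonical encoding so that abaci can be compared, is a hypothetical addition.

```python
def normalize(base, extra):
    """Canonical encoding: advance base while the next position holds a bead."""
    extra = set(extra)
    while base + 1 in extra:
        base += 1
        extra.remove(base)
    return base, extra


def decimate(base, extra, k):
    """D_k: runner r holds the symbols at positions q*k + r, indexed by q."""
    runners = []
    for r in range(k):
        base_r = (base - r) // k                  # largest q with q*k + r <= base
        extra_r = {(p - r) // k for p in extra if p % k == r}
        runners.append(normalize(base_r, extra_r))
    return runners


def interleave(runners):
    """I_k: position q on runner r becomes position q*k + r."""
    k = len(runners)
    base = min(b * k + r for r, (b, _) in enumerate(runners))
    extra = set()
    for r, (b, e) in enumerate(runners):
        first_q = (base - r) // k + 1             # smallest q with q*k + r > base
        extra.update(q * k + r for q in range(first_q, b + 1))
        extra.update(q * k + r for q in e)
    return normalize(base, extra)


# The abacus ...111 000101011 000... with its first gap at position -4.
w = (-5, {-1, 1, 3, 4})
print(decimate(*w, 4))              # [(-2, {1}), (-2, {0}), (-2, set()), (0, set())]
print(interleave(decimate(*w, 4)))  # the same abacus as w
```

Python's floor division and nonnegative `%` for a positive modulus make the quotient and remainder of negative positions agree with $i = qk + r$, $0 \leq r < k$.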


By computing $D_k(U(-1, \mu))$, we can convert any partition into a $k$-runner abacus. We
now show that moving one bead left one step on a $k$-runner abacus corresponds to removing
a border $k$-ribbon from the associated partition diagram.

10.13. Theorem: Bead Motions Encode Ribbon Removals. Suppose a partition
$\mu$ is encoded by a $k$-runner abacus $w = (w^0, w^1, \ldots, w^{k-1})$. Suppose that $v$ is a $k$-runner
abacus obtained from $w$ by changing one substring ...01... to ...10... in some $w^i$. Then
the partition $\nu$ associated to $v$ can be obtained by removing one border $k$-ribbon from $\mu$.
Moreover, there is a bijection between the set of removable border $k$-ribbons in $\mu$ and the
set of occurrences of the substring 01 in the components of $w$.
Proof. Recall from Remark 10.4 that we can encode the frontier of a partition $\mu$ by writing
a 0 (gap) for each horizontal step and writing a 1 (bead) for each vertical step. The word
so obtained (when preceded by 1's and followed by 0's) is a 1-runner abacus associated to
this partition, and $w$ is the $k$-decimation of this abacus.

Let $R$ be a border $k$-ribbon of $\mu$. The southeast border of $R$, which is part of the frontier
of $\mu$, gets encoded as a string of $k + 1$ symbols $r_0, r_1, \ldots, r_k$, where $r_0 = 0$ and $r_k = 1$. For
instance, the first ribbon in Example 10.8 has southeast border 0001010011. Note that the
northwest border of this ribbon is encoded by 1001010010, which is the string obtained by
interchanging the initial 0 and the terminal 1 in the original string. The following picture
indicates why this property holds for general $k$-ribbons.

[Picture omitted: a ribbon with the steps of its southeast and northwest borders labeled 0 and 1.]

Since $r_0 = 0$ and $r_k = 1$ are separated by $k$ positions in the 1-runner abacus, these two
symbols map to two consecutive symbols 01 on one of the runners in the $k$-runner abacus
for $\mu$. Changing these symbols to 10 will interchange $r_0$ and $r_k$ in the original word. Hence,
the portion of the frontier of $\mu$ consisting of the southeast border of $R$ gets replaced by the
northwest border of $R$. So, this bead motion transforms $\mu$ into the partition $\nu$ obtained by
removing the ribbon $R$.

Conversely, each substring 01 in the $k$-runner abacus for $\mu$ corresponds to a unique pair
of symbols $0 \cdots 1$ in the 1-runner abacus that are $k$ positions apart. This pair corresponds
to a unique pair of steps $H \cdots V$ on the frontier that are $k$ steps apart. Finally, this pair
of steps corresponds to a unique removable border $k$-ribbon of $\mu$. So the map from these
ribbons to occurrences of 01 on the runners of $w$ is a bijection. □


10.14. Example. Let us convert the partition $\mu = (5,5,4,3)$ from Example 10.10 to a
4-runner abacus. First, the 1-runner abacus $U(-1, \mu)$ is

$\cdots 111000101011000 \cdots$.

Decimating by 4 produces the following 4-runner abacus:


Note that the bead-gap pattern in this abacus can be read directly from the frontier of $\mu$
by filling in the runners one column at a time, working from left to right. For the purposes
of ribbon removal, we may decide arbitrarily where to place the gap corresponding to the
first step of the frontier; this decision determines the integer $m$ in the expression $U(m, \mu)$.

Now let us start removing ribbons. Suppose we push the rightmost bead on the top
runner left one position, producing the following abacus:

Reading down columns to recover the frontier of the new partition, we obtain the partition
$\nu = (4,3,3,3)$. We get $\nu$ from $\mu$ by removing one border 4-ribbon, as shown here:


Next, we push the rightmost bead on the top runner left one more position.
The new partition is $(4,3,2)$, which arises by removing one border 4-ribbon from $\nu$, as
shown here:


Finally, we push the rightmost bead on the second runner left one position to get the
following abacus:


The associated partition is $(4,1)$, as shown here:


At this point, all runners on the abacus are justified, so no further bead motion is possible.
This reflects the fact that we can remove no further border 4-ribbons from the 4-core $(4,1)$.

Now return to the original partition $\mu$ and the associated 4-runner abacus. Suppose we
start by moving the bead on the second runner left one position, producing the following
abacus:

This corresponds to removing a different border 4-ribbon from $\mu$:


Observe that $\mu$ has exactly two removable border 4-ribbons, whereas the 4-runner abacus
for $\mu$ has exactly two movable beads, in accordance with the last assertion of Theorem 10.13.


10.15. Example. Consider the following 3-runner abacus: 


We count six beads on this abacus that can be moved one position left without bumping 
into another bead. Accordingly, we expect the associated partition to have exactly six 
removable border 3-ribbons. This is indeed the case, as shown below (we have marked the 
southwestmost cell of each removable 3-ribbon with an asterisk): 


Now we can prove that the $k$-core obtained from a partition $\mu$ by repeated removal of
border $k$-ribbons is uniquely determined by $\mu$ and $k$.


10.16. Theorem: Uniqueness of k-cores. Suppose $\mu$ is an integer partition and $k \geq 1$ is
an integer. There is exactly one $k$-core $\rho$ obtainable from $\mu$ by repeatedly removing border
$k$-ribbons. We call $\rho$ the $k$-core of $\mu$.


Proof. Let $w$ be a fixed $k$-runner abacus associated to $\mu$, say $w = D_k(U(-1, \mu))$ for
definiteness. As we have seen, a particular sequence of ribbon-removal operations on $\mu$ corresponds
to a particular sequence of bead motions on $w$. The operations on $\mu$ terminate when we
reach a $k$-core, whereas the corresponding operations on $w$ terminate when the beads on
all runners of $w$ have been justified. Now $\rho$ is uniquely determined by the justified $k$-runner
abacus by applying $I_k$ and then $J$. The key observation is that the justified abacus obtained
from $w$ does not depend on the order in which individual bead moves are made. Thus, the
$k$-core $\rho$ does not depend on the order in which border ribbons are removed from $\mu$. □
10.17. Example. The theorem shows that we can calculate the $k$-core of $\mu$ by
justifying any $k$-runner abacus associated to $\mu$. For example, consider the partition
$\mu = (10,10,10,8,8,8,7,4)$ from Example 10.15. Justifying the 3-runner abacus in that example
produces the following abacus:


We find that the 3-core of $\mu$ is $(1,1)$.
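
The procedure "decimate, justify every runner, interleave, read off the partition" can be sketched as below. The `(base, extra)` encoding (beads at every position at most `base` and at the positions in `extra`) and all function names are illustrative assumptions; the sketch starts from $U(-1, \mu)$ for definiteness.

```python
def unjustify(m, mu):
    """U(m, mu) as (base, extra): beads at all positions <= base and at `extra`."""
    return m - len(mu), {m - i + part for i, part in enumerate(mu)}


def justify(base, extra):
    """J(w) = (m, lam) for the abacus encoded by (base, extra)."""
    n = base + 1
    while n in extra:                         # locate the leftmost gap
        n += 1
    beads = sorted(j for j in extra if j > n)
    moves = [j - (n + idx) for idx, j in enumerate(beads)]
    return n + len(beads) - 1, tuple(reversed(moves))


def k_core(mu, k):
    """k-core of mu, found by justifying each runner of D_k(U(-1, mu))."""
    base, extra = unjustify(-1, mu)
    justified = set()
    for r in range(k):
        base_r = (base - r) // k              # runner r has beads at all q <= base_r
        count = sum(1 for p in extra if p % k == r)
        # After justification, the extra beads occupy q = base_r + 1, ..., base_r + count.
        justified.update(q * k + r for q in range(base_r + 1, base_r + count + 1))
    _, rho = justify(base, justified)
    return rho


print(k_core((5, 5, 4, 3), 4))                     # (4, 1)
print(k_core((10, 10, 10, 8, 8, 8, 7, 4), 3))      # (1, 1)
```

Because justification only depends on how many beads each runner carries, the sketch never has to choose an order of bead moves, which mirrors the key observation in the proof of Theorem 10.16.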
10.4 k-Quotients of a Partition


Each runner of a k-runner abacus can be regarded as a 1-runner abacus, which corresponds 
(under the justification bijection J) to an element of Z x Par. This observation leads to the 
definition of the k-quotients of a partition. 


10.18. Definition: k-Quotients of a Partition. Let $\mu$ be a partition and $k \geq 1$ an
integer. Consider the $k$-runner abacus $(w^0, w^1, \ldots, w^{k-1}) = D_k(U(-1, \mu))$. Write
$J(w^i) = (m_i, \nu^i)$ for $0 \leq i < k$. The partitions appearing in the $k$-tuple $(\nu^0, \nu^1, \ldots, \nu^{k-1})$ are called
the $k$-quotients of $\mu$.


10.19. Example. Let $\mu = (5,5,4,3)$. In Example 10.14, we computed the 4-runner abacus
$D_4(U(-1, \mu))$:


Justifying each runner and converting the resulting 4-runner abacus back to a partition
produces the 4-core of $\mu$, namely $(4,1)$. On the other hand, converting each runner to a
separate partition produces the 4-tuple of 4-quotients of $\mu$, namely

$(\nu^0, \nu^1, \nu^2, \nu^3) = ((2), (1), (0), (0))$.


10.20. Example. Consider the partition $\mu = (10,10,10,8,8,8,7,4)$ from Example 10.15.
We compute

$U(-1, \mu) = \cdots 111100001000101110011100 \cdots$.

Decimation by 3 produces the 3-runner abacus shown here:

Justifying each runner shows that the 3-core of $\mu$ is $\rho = (1,1)$. On the other hand, by
regarding each runner separately as a partition, we obtain the 3-tuple of 3-quotients of $\mu$:

$(\nu^0, \nu^1, \nu^2) = ((3,2,2), (4,4), (3,2,1))$.

Observe that $|\mu| = 65 = 2 + 3 \cdot (7 + 8 + 6) = |\rho| + 3|\nu^0| + 3|\nu^1| + 3|\nu^2|$.
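
The computation of $k$-quotients in the last two examples can be sketched as follows. The `(base, extra)` encoding (beads at every position at most `base` and at the positions in `extra`) and the function names are illustrative assumptions; the empty partition is represented by the empty tuple.

```python
def unjustify(m, mu):
    """U(m, mu) as (base, extra): beads at all positions <= base and at `extra`."""
    return m - len(mu), {m - i + part for i, part in enumerate(mu)}


def justify(base, extra):
    """J(w) = (m, lam) for the abacus encoded by (base, extra)."""
    n = base + 1
    while n in extra:                         # locate the leftmost gap
        n += 1
    beads = sorted(j for j in extra if j > n)
    moves = [j - (n + idx) for idx, j in enumerate(beads)]
    return n + len(beads) - 1, tuple(reversed(moves))


def k_quotients(mu, k):
    """The k-quotients of mu: apply J to each runner of D_k(U(-1, mu))."""
    base, extra = unjustify(-1, mu)
    quotients = []
    for r in range(k):
        base_r = (base - r) // k              # runner r: beads at all q <= base_r
        extra_r = {(p - r) // k for p in extra if p % k == r}
        _, nu = justify(base_r, extra_r)
        quotients.append(nu)
    return quotients


print(k_quotients((5, 5, 4, 3), 4))                   # [(2,), (1,), (), ()]
print(k_quotients((10, 10, 10, 8, 8, 8, 7, 4), 3))    # [(3, 2, 2), (4, 4), (3, 2, 1)]
```

For the second partition, the sizes of the quotients sum to $7 + 8 + 6 = 21$, matching the count $65 = 2 + 3 \cdot 21$ observed above.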

Now consider what would have happened if we had performed similar computations on
the 3-runner abacus for $\mu$ displayed in Example 10.15, which is $D_3(U(0, \mu))$. The 3-core
coming from this abacus is still $(1,1)$, but converting each runner to a partition produces
the following 3-tuple:

$((3,2,1), (3,2,2), (4,4))$.

This 3-tuple arises by cyclically shifting the previous 3-tuple one step to the right. One
can check that this holds in general: if the $k$-quotients for $\mu$ are $(\nu^0, \ldots, \nu^{k-1})$, then the
$k$-quotients computed using $D_k(U(m, \mu))$ are $(\nu^{k-m'}, \ldots, \nu^{k-1}, \nu^0, \nu^1, \ldots)$, where $m'$ is the
integer remainder when $m + 1$ is divided by $k$.

10.21. Remark. Here is a way to compute $(w^0, w^1, \ldots, w^{k-1}) = D_k(U(-1, \mu))$ from the
frontier of $\mu$ without writing the intermediate abacus $U(-1, \mu)$. Draw a line of slope $-1$
starting at the northwest corner of the diagram of $\mu$. The first step on the frontier of $\mu$ lying
northeast of this line corresponds to position 0 of the zeroth runner $w^0$. The next step is
position 0 on $w^1$, and so on. The step just southwest of the diagonal line is position $-1$ on
$w^{k-1}$, the previous step is position $-1$ on $w^{k-2}$, and so on. To see that this works, it must
be checked that the first step northeast of the diagonal line gets mapped to position 0 on
the 1-runner abacus $U(-1, \mu)$.


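10.22. Theorem: Partition Division. Let $\mathrm{Core}(k)$ be the set of all $k$-cores. There is a
bijection

\[ \Delta_k : \mathrm{Par} \to \mathrm{Core}(k) \times \mathrm{Par}^k \]

such that $\Delta_k(\mu) = (\rho, \nu^0, \ldots, \nu^{k-1})$, where $\rho$ is the $k$-core of $\mu$ and $\nu^0, \ldots, \nu^{k-1}$ are the
$k$-quotients of $\mu$. We have $|\mu| = |\rho| + k \sum_{i=0}^{k-1} |\nu^i|$.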


Proof. We have already seen that the function $\Delta_k$ is well-defined and maps into the
stated codomain. To see that this function is a bijection, we describe its inverse. Given
$(\rho, \nu^0, \ldots, \nu^{k-1}) \in \mathrm{Core}(k) \times \mathrm{Par}^k$, first compute the $k$-runner abacus
$(w^0, \ldots, w^{k-1}) = D_k(U(-1, \rho))$. Each $w^i$ is a justified 1-runner abacus because $\rho$ is a $k$-core; say $w^i$ is
justified at position $m_i$. Now replace each $w^i$ by $v^i = U(m_i, \nu^i)$. Finally, let $\mu$ be the unique
partition satisfying $J(I_k(v^0, \ldots, v^{k-1})) = (-1, \mu)$. This construction reverses the one used
to produce $k$-cores and $k$-quotients, so $\mu$ is the unique partition mapped to $(\rho, \nu^0, \ldots, \nu^{k-1})$
by $\Delta_k$.

To prove the formula for $|\mu|$, consider the bead movements used to justify the runners
of the $k$-runner abacus $D_k(U(-1, \mu))$. On one hand, every time we move a bead one step
left on this abacus, the area of $\mu$ drops by $k$ since the bead motion removes one border
$k$-ribbon. When we finish moving all the beads, we are left with the $k$-core $\rho$. It follows that
$|\mu| = |\rho| + km$ where $m$ is the total number of bead motions on all $k$ runners. On the other
hand, for $0 \leq i < k$, let $m_i$ be the number of times we move a bead one step left on runner
$i$. Then $m = m_0 + m_1 + \cdots + m_{k-1}$, whereas $m_i = |\nu^i|$ by Theorem 10.3. Substituting these
expressions into $|\mu| = |\rho| + km$ gives the stated formula for $|\mu|$. □
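
The inverse map described in the first paragraph of this proof can be sketched as follows. The `(base, extra)` encoding (beads at every position at most `base` and at the positions in `extra`) and the name `undivide` are illustrative assumptions, and the sketch takes for granted that its first argument really is a $k$-core for $k$ equal to the number of quotients supplied.

```python
def unjustify(m, mu):
    """U(m, mu) as (base, extra): beads at all positions <= base and at `extra`."""
    return m - len(mu), {m - i + part for i, part in enumerate(mu)}


def justify(base, extra):
    """J(w) = (m, lam) for the abacus encoded by (base, extra)."""
    n = base + 1
    while n in extra:                         # locate the leftmost gap
        n += 1
    beads = sorted(j for j in extra if j > n)
    moves = [j - (n + idx) for idx, j in enumerate(beads)]
    return n + len(beads) - 1, tuple(reversed(moves))


def undivide(rho, quotients):
    """Rebuild mu from its k-core rho and its k-quotients (k = len(quotients))."""
    k = len(quotients)
    base, extra = unjustify(-1, rho)
    runners = []
    for r, nu in enumerate(quotients):
        # Runner r of D_k(U(-1, rho)) is justified at position m_r (rho is a k-core).
        m_r = (base - r) // k + sum(1 for p in extra if p % k == r)
        runners.append(unjustify(m_r, nu))    # v^r = U(m_r, nu^r)
    # Interleave: position q on runner r becomes position q*k + r.
    new_base = min(b * k + r for r, (b, _) in enumerate(runners))
    new_extra = set()
    for r, (b, e) in enumerate(runners):
        first_q = (new_base - r) // k + 1     # smallest q with q*k + r > new_base
        new_extra.update(q * k + r for q in range(first_q, b + 1))
        new_extra.update(q * k + r for q in e)
    _, mu = justify(new_base, new_extra)
    return mu


print(undivide((4, 1), [(2,), (1,), (), ()]))   # (5, 5, 4, 3)
```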
10.5 k-Quotients and Hooks


We close our discussion of partition division by describing a way to compute the $k$-quotients
of $\mu$ directly from the diagram of $\mu$, without recourse to abaci. We need the following device
for labeling cells of $\mathrm{dg}(\mu)$ and steps on the frontier of $\mu$ by integers in $\{0, 1, \ldots, k - 1\}$.


10.23. Definition: Content and k-Content. Consider a partition diagram for $\mu$, drawn
with the longest row on top. Introduce a coordinate system so that the northwest corner of
the diagram is $(0,0)$ and $(i,j)$ is located $i$ steps south and $j$ steps east of the origin. The
content of the point $(i,j)$ is $c(i,j) = j - i$. The content of a cell in the diagram of $\mu$ is the
content of its southeast corner. The content of a frontier step from $(i,j)$ to $(i,j+1)$ is $j - i$.
The content of a frontier step from $(i,j)$ to $(i-1,j)$ is $j - i$. If $z$ is a lattice point, cell, or
step in the diagram, then the $k$-content $c_k(z)$ is the unique value $r \in \{0, 1, \ldots, k - 1\}$ such
that $c(z) \equiv r \pmod{k}$.


10.24. Example. The left side of Figure 10.1 shows the diagram of the partition $\mu = (10,10,10,8,8,8,7,4)$ with each cell and frontier step labeled by its content. On the right side of the figure, each cell and step is labeled by its 3-content.


[Figure: two copies of the diagram of $\mu$. Left panel (content): each cell and frontier step labeled by its content. Right panel (3-content): each cell and frontier step labeled by its 3-content.]


FIGURE 10.1 
Content and 3-content of cells and steps. 


Given a cell in the diagram of $\mu$, we obtain an associated pair of steps on the frontier of $\mu$ by traveling due south or due east from the cell in question. Suppose we mark all cells whose associated steps both have 3-content zero. Then erase all other cells and shift the marked cells up and left as far as possible. The following diagram results:


This partition $(3,2,2)$ is precisely the zeroth 3-quotient of $\mu$. Similarly, marking the cells whose associated steps both have 3-content equal to 1 produces the next 3-quotient of $\mu$:


Finally, marking the cells whose associated steps both have 3-content equal to 2 produces the last 3-quotient of $\mu$:

In general, to obtain the $i$th $k$-quotient $\nu^i$ of $\mu$ from the diagram of $\mu$, label each row (respectively column) of the diagram with the $k$-content of the frontier step located in that row (respectively column). Erase all rows and columns not labeled $i$. The number of cells remaining in the $j$th unerased row is the $j$th part of $\nu^i$. To see why this works, recall that the cells of $\nu^i$ correspond bijectively to the pairs of symbols $0\cdots 1$ on the $i$th runner of the $k$-runner abacus for $\mu$. In turn, these pairs correspond to pairs of symbols $w_s = 0$, $w_t = 1$ on the 1-runner abacus for $\mu$, where $s < t$ and $s \equiv i \equiv t \pmod{k}$. The symbols in positions congruent to $i$ mod $k$ come from the steps on the frontier of $\mu$ whose $k$-content is $i$. Finally, the relevant pairs of steps on the frontier correspond to the unerased cells in the construction described above. Composing all these bijections, we see that the cells of $\nu^i$ are in one-to-one correspondence with the unerased cells of the construction. Furthermore, cells in row $j$ of $\nu^i$ are mapped onto the unerased cells in the $j$th unerased row of $\mu$. It follows that the construction at the beginning of this paragraph does indeed produce the $k$-quotient $\nu^i$.
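
The row-and-column labeling just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `frontier_labels` and `k_quotient` are not from the text, and partitions are assumed to be given as weakly decreasing sequences of positive integers.

```python
def frontier_labels(mu, k):
    """Return (row_labels, col_labels) for the diagram of mu.

    row_labels[i-1] is the k-content of the north step ending row i,
    col_labels[j-1] is the k-content of the east step below column j.
    """
    mu = [p for p in mu if p > 0]
    # Row i ends with the step from (i, mu_i) to (i-1, mu_i); content mu_i - i.
    rows = [(part - i) % k for i, part in enumerate(mu, start=1)]
    ncols = mu[0] if mu else 0
    col_len = [sum(1 for part in mu if part >= j) for j in range(1, ncols + 1)]
    # Column j sits above the step from (c, j-1) to (c, j); content (j-1) - c.
    cols = [((j - 1) - c) % k for j, c in enumerate(col_len, start=1)]
    return rows, cols


def k_quotient(mu, k, i):
    """Erase all rows and columns not labeled i; return the surviving row lengths."""
    mu = [p for p in mu if p > 0]
    rows, cols = frontier_labels(mu, k)
    parts = []
    for r, part in enumerate(mu):
        if rows[r] == i:
            parts.append(sum(1 for j in range(part) if cols[j] == i))
    return tuple(p for p in parts if p > 0)


mu = (10, 10, 10, 8, 8, 8, 7, 4)
print([k_quotient(mu, 3, i) for i in range(3)])
```

For the partition of Example 10.24, the call with `i = 0` returns `(3, 2, 2)`, in agreement with the zeroth 3-quotient found above.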




10.6 Antisymmetric Polynomials 


We now define antisymmetric polynomials, which form a vector space similar to the vector 
space of symmetric polynomials studied in Chapter 9. 


10.25. Definition: Antisymmetric Polynomials. A polynomial $f \in \mathbb{R}[x_1,\ldots,x_N]$ is antisymmetric iff for all $w \in S_N$,

$$f(x_{w(1)}, x_{w(2)}, \ldots, x_{w(N)}) = \mathrm{sgn}(w)\, f(x_1, x_2, \ldots, x_N).$$

More generally, everything said below holds for antisymmetric polynomials with coefficients in any field $K$ containing $\mathbb{Q}$.


10.26. Remark. The group $S_N$ acts on the set $\{x_1,\ldots,x_N\}$ via $w \bullet x_i = x_{w(i)}$ for all $w \in S_N$ and $i$ between 1 and $N$. This action extends to an action of $S_N$ on $\mathbb{R}[x_1,\ldots,x_N]$ given by $w \bullet f = f(x_{w(1)},\ldots,x_{w(N)})$ for all $w \in S_N$ and $f \in \mathbb{R}[x_1,\ldots,x_N]$. The polynomial $f$ is antisymmetric iff $w \bullet f = \mathrm{sgn}(w) f$ for all $w \in S_N$. It suffices to check this condition when $w$ is a basic transposition $(i,i+1)$. For, any $w \in S_N$ can be written as a product of basic transpositions, say $w = t_1 t_2 \cdots t_k$. If we have checked that $t \bullet f = \mathrm{sgn}(t) f = -f$ for every basic transposition $t = (i,i+1)$ with $i$ between 1 and $N-1$, then

$$w \bullet f = t_1 \bullet (\cdots \bullet (t_k \bullet f)) = (-1)^k f = \mathrm{sgn}(w) f.$$

We can restate this result by saying that $f \in \mathbb{R}[x_1,\ldots,x_N]$ is antisymmetric iff for all $i$ in the range $1 \le i < N$,

$$f(x_1,\ldots,x_{i+1},x_i,\ldots,x_N) = -f(x_1,\ldots,x_i,x_{i+1},\ldots,x_N).$$


10.27. Example. The polynomial $f(x_1,\ldots,x_N) = \prod_{1 \le j < k \le N} (x_j - x_k)$ is antisymmetric. To check this, consider what happens to the factors in the product when we interchange $x_i$ and $x_{i+1}$. Factors not involving $x_i$ or $x_{i+1}$ are unchanged; factors of the form $(x_i - x_k)$ with $k > i+1$ get interchanged with factors of the form $(x_{i+1} - x_k)$; and factors of the form $(x_j - x_i)$ with $j < i$ get interchanged with factors of the form $(x_j - x_{i+1})$. Finally, the factor $(x_i - x_{i+1})$ becomes $(x_{i+1} - x_i) = -(x_i - x_{i+1})$. Thus, $(i,i+1) \bullet f = -f$ for $1 \le i < N$, proving antisymmetry of $f$.
The polynomial $f = \prod_{j<k} (x_j - x_k)$ in this example is the Vandermonde determinant

$$\det[x_j^{N-i}]_{1 \le i,j \le N} = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{N-i}$$

(see §12.11 for a combinatorial proof of this assertion).


We can use determinants similar to the Vandermonde determinant to manufacture ad- 
ditional examples of antisymmetric polynomials. Here we need the combinatorial definition 
of determinants, which is covered in §12.9. 


10.28. Definition: Monomial Antisymmetric Polynomials. Let $\mu = (\mu_1 > \mu_2 > \cdots > \mu_N)$ be a strictly decreasing sequence of $N$ nonnegative integers. Define a polynomial $a_\mu(x_1,\ldots,x_N)$ by the formula

$$a_\mu(x_1,\ldots,x_N) = \det[x_j^{\mu_i}]_{1 \le i,j \le N} = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{\mu_i}.$$

We call $a_\mu$ a monomial antisymmetric polynomial indexed by $\mu$.


To see that $a_\mu$ really is antisymmetric, note that interchanging $x_k$ and $x_{k+1}$ has the effect of interchanging columns $k$ and $k+1$ in the determinant defining $a_\mu$. This column switch changes the sign of $a_\mu$ (for a proof of this determinant fact, see Theorem 12.50).


10.29. Example. Let $N = 3$ and $\mu = (5,4,2)$. Then

$$a_\mu(x_1,x_2,x_3) = +x_1^5 x_2^4 x_3^2 + x_1^4 x_2^2 x_3^5 + x_1^2 x_2^5 x_3^4 - x_1^4 x_2^5 x_3^2 - x_1^5 x_2^2 x_3^4 - x_1^2 x_2^4 x_3^5.$$

As the previous example shows, $a_\mu(x_1,\ldots,x_N)$ is a sum of $N!$ distinct monomials obtained by rearranging the subscripts (or equivalently, the exponents) in the monomial $x_1^{\mu_1} x_2^{\mu_2} \cdots x_N^{\mu_N}$. Each monomial appears in the sum with sign $+1$ or $-1$, where the sign of $x_1^{e_1} \cdots x_N^{e_N}$ depends on the parity of the number of basic transpositions needed to transform the sequence $(e_1,\ldots,e_N)$ to the sorted sequence $(\mu_1,\ldots,\mu_N)$. It follows from these remarks that $a_\mu$ is a nonzero homogeneous polynomial of degree $|\mu| = \mu_1 + \cdots + \mu_N$.
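
As a quick check of Definition 10.28, the signed sum over permutations can be expanded mechanically. The sketch below stores a polynomial as a dictionary from exponent vectors $(e_1,\ldots,e_N)$ to coefficients; this representation and the names `perm_sign` and `monomial_antisymmetric` are choices made for illustration only.

```python
from itertools import permutations


def perm_sign(w):
    """Sign of a permutation in one-line notation, computed from its inversions."""
    n = len(w)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])
    return -1 if inv % 2 else 1


def monomial_antisymmetric(mu):
    """Expand a_mu = sum over w of sgn(w) * prod_i x_{w(i)}^{mu_i}.

    The result maps each exponent vector (e_1, ..., e_N) to its coefficient.
    """
    n = len(mu)
    poly = {}
    for w in permutations(range(1, n + 1)):
        exps = [0] * n
        for i, var in enumerate(w):
            exps[var - 1] = mu[i]  # the factor x_{w(i)}^{mu_i}
        key = tuple(exps)
        poly[key] = poly.get(key, 0) + perm_sign(w)
    return poly


a = monomial_antisymmetric((5, 4, 2))
for exps, coeff in sorted(a.items(), reverse=True):
    print(coeff, exps)
```

Swapping the first two entries of every exponent vector negates every coefficient, which is the antisymmetry condition for the basic transposition $(1,2)$.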

10.30. Definition: δ(N). For each $N \ge 1$, let $\delta(N) = (N-1, N-2, \ldots, 2, 1, 0)$.


The strictly decreasing sequences $\mu = (\mu_1 > \mu_2 > \cdots > \mu_N)$ correspond bijectively to the weakly decreasing sequences $\lambda = (\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N)$ via the maps sending $\mu$ to $\mu - \delta(N)$ and $\lambda$ to $\lambda + \delta(N)$. It follows that each polynomial $a_\mu$ can be written $a_{\lambda+\delta(N)}$ for a unique partition $\lambda \in \mathrm{Par}_N$. This indexing scheme is used frequently below. Note that when $\lambda = (0,\ldots,0)$, we have $\mu = \delta(N)$ and $a_{\delta(N)} = \prod_{1 \le j < k \le N} (x_j - x_k)$ by Example 10.27. Observe that $a_{\delta(N)}$ is a homogeneous polynomial of degree $N(N-1)/2 = \binom{N}{2}$.


10.31. Definition: Spaces of Antisymmetric Polynomials. Let $A_N$ be the set of all antisymmetric polynomials in $\mathbb{R}[x_1,\ldots,x_N]$. Let $A_N^n$ consist of those polynomials in $A_N$ that are homogeneous of degree $n$, together with the zero polynomial.


It can be verified that $A_N$ is a vector subspace of $\mathbb{R}[x_1,\ldots,x_N]$, and each $A_N^n$ is a subspace of $A_N$. We now exhibit bases for these vector spaces involving monomial antisymmetric polynomials. We use the notation $\mathrm{Par}_N^d(n)$ to denote the set of all partitions of $n$ into $N$ distinct nonnegative parts, and let $\mathrm{Par}_N^d = \bigcup_{n \ge \binom{N}{2}} \mathrm{Par}_N^d(n)$.


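10.32. Theorem: Monomial Basis for $A_N^n$. If $n < \binom{N}{2}$, then $A_N^n = \{0\}$. If $n \ge \binom{N}{2}$, then

$$\{a_\mu : \mu \in \mathrm{Par}_N^d(n)\} = \{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N(n - \tbinom{N}{2})\}$$

is a basis of the real vector space $A_N^n$. Hence, the collection

$$\{a_\mu : \mu \in \mathrm{Par}_N^d\} = \{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N\}$$

is a basis of $A_N$.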


Proof. Suppose $e = (e_1,\ldots,e_N)$ is any exponent sequence, $f \in A_N$ is an arbitrary antisymmetric polynomial, and $w \in S_N$. Let $c$ be the coefficient of $x^e = x_1^{e_1} \cdots x_N^{e_N}$ in $f$, so

$$f = c\, x_1^{e_1} \cdots x_N^{e_N} + \text{other terms}.$$

Acting by $w$, we see that

$$\mathrm{sgn}(w) f = w \bullet f = c\, x_{w(1)}^{e_1} \cdots x_{w(N)}^{e_N} + \text{other terms} = c\, x_1^{e_{w^{-1}(1)}} \cdots x_N^{e_{w^{-1}(N)}} + \text{other terms} = c\, x^{w*e} + \text{other terms},$$

where $w * e = (e_{w^{-1}(1)}, \ldots, e_{w^{-1}(N)})$. In other words, writing $f|_{x^e}$ for the coefficient of $x^e$ in $f$, we have $f|_{x^{w*e}} = \mathrm{sgn}(w)(f|_{x^e})$.

Let us apply this fact to an exponent sequence $e$ such that $e_i = e_j$ for some $i \ne j$. Let $w = (i,j)$, so that $w * e = e$ and $\mathrm{sgn}(w) = -1$. It follows that $c = -c$, so $c = 0$. This means that no antisymmetric polynomial contains any monomial with a repeated value in its exponent vector. In particular, the smallest possible degree of a monomial that can appear with nonzero coefficient in any antisymmetric polynomial in $N$ variables is $0 + 1 + 2 + \cdots + (N-1) = \binom{N}{2}$. This proves that $A_N^n = \{0\}$ for $n < \binom{N}{2}$.

For the second assertion, recall that the map sending $\lambda$ to $\lambda + \delta(N)$ is a bijection from $\mathrm{Par}_N(n - \binom{N}{2})$ to $\mathrm{Par}_N^d(n)$. So we need only show that $\{a_\mu : \mu \in \mathrm{Par}_N^d(n)\}$ is a basis for $A_N^n$. To show that this set spans $A_N^n$, fix $f \in A_N^n$. By the previous paragraph, we can write $f = \sum_\alpha c_\alpha x^\alpha$, where we sum over all sequences $\alpha = (\alpha_1,\ldots,\alpha_N) \in \mathbb{Z}_{\ge 0}^N$ with distinct entries summing to $n$, and each $c_\alpha$ is a real scalar. We claim

$$f = \sum_{\nu \in \mathrm{Par}_N^d(n)} c_\nu a_\nu.$$

To prove this, we compare the coefficient of $x^\alpha$ on each side. Choose $\mu \in \mathrm{Par}_N^d(n)$ and $w \in S_N$ such that $w * \mu = \alpha$ ($\mu$ consists of the entries of $\alpha$ sorted into decreasing order). By the first paragraph of this proof,

$$f|_{x^\alpha} = f|_{x^{w*\mu}} = \mathrm{sgn}(w)(f|_{x^\mu}) = \mathrm{sgn}(w) c_\mu.$$

On the other side, $a_\nu|_{x^\alpha} = 0$ for all $\nu \ne \mu$ (since no rearrangement of $\nu$ equals $\alpha$). For $\nu = \mu$, antisymmetry gives $a_\mu|_{x^\alpha} = \mathrm{sgn}(w)(a_\mu|_{x^\mu}) = \mathrm{sgn}(w)$. Multiplying by $c_\nu$ and summing over all $\nu$, the coefficient of $x^\alpha$ in $\sum_\nu c_\nu a_\nu$ is $\mathrm{sgn}(w) c_\mu$, as needed.

To prove linear independence, suppose

$$0 = \sum_{\nu \in \mathrm{Par}_N^d(n)} d_\nu a_\nu \quad \text{with all } d_\nu \in \mathbb{R}.$$

For a fixed $\mu \in \mathrm{Par}_N^d(n)$, $a_\mu$ is the only polynomial among the $a_\nu$'s that involves the monomial $x^\mu$. Extracting this coefficient on both sides of the given equation, we find that $0 = d_\mu \cdot 1 = d_\mu$. Since $\mu$ was arbitrary, all $d_\mu$ are zero. □


The next result explains the relationship between the various vector spaces $\Lambda_N^k$ and $A_N^n$.


10.33. Theorem: Symmetric and Antisymmetric Polynomials. For each $k \ge 0$, the vector spaces $\Lambda_N^k$ and $A_N^{k+\binom{N}{2}}$ are isomorphic, as are the vector spaces $\Lambda_N$ and $A_N$. In each of these cases, an isomorphism is given by the formula $M(f) = f \cdot a_{\delta(N)}$ for $f \in \Lambda_N$, and the inverse isomorphism sends $g \in A_N$ to $g/a_{\delta(N)}$. In particular, every antisymmetric polynomial in $N$ variables is divisible by the polynomial $a_{\delta(N)}$.


Proof. Fix $k \ge 0$, and consider the map $M_k : \Lambda_N^k \to \mathbb{R}[x_1,\ldots,x_N]$ defined by $M_k(f) = f \cdot a_{\delta(N)}$ for $f \in \Lambda_N^k$. First, $f$ is homogeneous of degree $k$ and $a_{\delta(N)}$ is homogeneous of degree $\binom{N}{2}$, so $M_k(f)$ is homogeneous of degree $k + \binom{N}{2}$. Second, $M_k(f)$ is antisymmetric, since for any $w \in S_N$,

$$w \bullet (f a_{\delta(N)}) = (w \bullet f) \cdot (w \bullet a_{\delta(N)}) = f \cdot (\mathrm{sgn}(w)\, a_{\delta(N)}) = \mathrm{sgn}(w)(f a_{\delta(N)}).$$

So $M_k$ takes values in the codomain $A_N^{k+\binom{N}{2}}$. Third, one immediately verifies that $M_k$ is a linear map. Fourth, the kernel of this linear map is zero: $M_k(f) = 0$ implies $f \cdot a_{\delta(N)} = 0$, which implies $f = 0$ since $a_{\delta(N)}$ is a nonzero element of the integral domain $\mathbb{R}[x_1,\ldots,x_N]$. So $M_k$ is injective. Fifth, $M_k$ must also be surjective since its domain and codomain are vector spaces having the same finite dimension $|\mathrm{Par}_N(k)|$. So each $M_k$ is an isomorphism.

Since $\Lambda_N$ (respectively $A_N$) is the direct sum of subspaces $\Lambda_N^k$ (respectively $A_N^{k+\binom{N}{2}}$), it follows that $\Lambda_N$ and $A_N$ are isomorphic as well. Finally, surjectivity of the map sending $f$ to $f a_{\delta(N)}$ means that every antisymmetric polynomial $g$ has the form $f a_{\delta(N)}$ for some symmetric polynomial $f$. So $g$ is divisible by $a_{\delta(N)}$ in $\mathbb{R}[x_1,\ldots,x_N]$.


10.34. Remark. Suppose we apply the inverse of the isomorphism $M_k$ to the basis $\{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N(k)\}$ of $A_N^{k+\binom{N}{2}}$. We obtain a basis $\{a_{\lambda+\delta(N)}/a_{\delta(N)} : \lambda \in \mathrm{Par}_N(k)\}$ of $\Lambda_N^k$. It turns out that $a_{\lambda+\delta(N)}/a_{\delta(N)}$ is none other than the Schur symmetric polynomial $s_\lambda(x_1,\ldots,x_N)$. To prove this fact and other properties of antisymmetric polynomials, we use the labeled abaci introduced below.

10.7 Labeled Abaci


Given a sequence of distinct nonnegative integers $\mu = (\mu_1 > \mu_2 > \cdots > \mu_N)$, recall that the monomial antisymmetric polynomial indexed by $\mu$ is defined by

$$a_\mu(x_1,\ldots,x_N) = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{\mu_i}.$$

The next definition introduces a set of signed weighted combinatorial objects to model this formula.


10.35. Definition: Labeled Abaci. A labeled abacus with $N$ beads is a word $v = (v_i : i \ge 0)$ such that each symbol $1,\ldots,N$ appears exactly once in $v$, and all other symbols in $v$ are zero. We think of the indices $i$ as positions on an abacus containing one runner that extends to infinity in the positive direction. When $v_i = 0$, there is a gap at position $i$ on the abacus; when $v_i = j > 0$, there is a bead labeled $j$ at position $i$. The weight of the abacus $v$ is

$$\mathrm{wt}(v) = \prod_{i:\, v_i > 0} x_{v_i}^{i}.$$

So if bead $j$ is located at position $i$, this bead contributes a factor of $x_j^i$ to the weight.

We can encode a labeled abacus by specifying the positions occupied by the beads and the ordering of the bead labels. Formally, define $\mathrm{pos}(v) = (\mu_1 > \mu_2 > \cdots > \mu_N)$ to be the list of indices $i$ such that $v_i > 0$, written in decreasing order. Then define $w(v) = (v_{\mu_1}, v_{\mu_2}, \ldots, v_{\mu_N}) \in S_N$. We define the sign of $v$ to be the sign of the permutation $w(v)$, which is $(-1)^{\mathrm{inv}(w(v))}$. Let LAbc be the set of all labeled abaci, and for each $\mu \in \mathrm{Par}_N^d$, let $\mathrm{LAbc}(\mu) = \{v \in \mathrm{LAbc} : \mathrm{pos}(v) = \mu\}$.


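For each fixed $\mu \in \mathrm{Par}_N^d$, there is a bijection between $\mathrm{LAbc}(\mu)$ and $S_N$ sending $v \in \mathrm{LAbc}(\mu)$ to $w(v) \in S_N$. Furthermore, an abacus $v \in \mathrm{LAbc}(\mu)$ has sign $\mathrm{sgn}(w(v))$ and weight $\prod_{i=1}^{N} x_{w(v)_i}^{\mu_i}$. So

$$\sum_{v \in \mathrm{LAbc}(\mu)} \mathrm{sgn}(v)\, \mathrm{wt}(v) = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} x_{w(i)}^{\mu_i} = a_\mu(x_1,\ldots,x_N).$$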


10.36. Example. Let $N = 3$ and $\nu = (5,4,2)$. Earlier, we computed

$$a_\nu(x_1,x_2,x_3) = +x_1^5 x_2^4 x_3^2 + x_1^4 x_2^2 x_3^5 + x_1^2 x_2^5 x_3^4 - x_1^4 x_2^5 x_3^2 - x_1^5 x_2^2 x_3^4 - x_1^2 x_2^4 x_3^5.$$

The six terms in this polynomial come from the six labeled abaci in $\mathrm{LAbc}(\nu)$ shown in Figure 10.2. Observe that we read labels from right to left in $v$ to obtain the permutation $w(v)$. This is necessary so that the leading term $x_1^{\nu_1} \cdots x_N^{\nu_N}$ will correspond to the identity permutation and have positive sign.


Informally, we justify a labeled abacus $v \in \mathrm{LAbc}(\mu)$ by moving all beads to the left as far as they will go. This produces a justified labeled abacus $J(v) = (w_N,\ldots,w_2,w_1,0,0,\ldots) \in \mathrm{LAbc}(\delta(N))$, where $(w_1,\ldots,w_N) = w(v)$. To recover $v$ from $J(v)$, first write $\mu = \lambda + \delta(N)$ for some $\lambda \in \mathrm{Par}_N$. Move the rightmost bead (labeled $w_1$) to the right $\lambda_1$ positions from position $N-1$ to position $N-1+\lambda_1 = \mu_1$. Then move the next bead (labeled $w_2$) to the right $\lambda_2$ positions from position $N-2$ to position $N-2+\lambda_2 = \mu_2$, and so on.
[Figure: the six labeled abaci in $\mathrm{LAbc}((5,4,2))$.]
FIGURE 10.2 


Labeled abaci. 


10.8 The Pieri Rule for $p_k$


It is routine to check that the product of an antisymmetric polynomial and a symmetric polynomial is an antisymmetric polynomial, so such a product can be written as a linear combination of the monomial antisymmetric polynomials. In the next few sections, we derive several Pieri-type rules for expressing a product $a_{\lambda+\delta(N)}\, g$ (where $g$ is symmetric) in terms of the $a_\mu$. We begin by considering the case where $g = p_k(x_1,\ldots,x_N) = \sum_{i=1}^{N} x_i^k$ is a power-sum symmetric polynomial.

We know $a_{\lambda+\delta(N)}(x_1,\ldots,x_N)$ is a sum of signed terms, each of which represents a labeled abacus with beads in positions given by $\mu = \lambda + \delta(N)$. If we multiply some term in this sum by $x_i^k$, what happens to the associated abacus? Recalling that the power of $x_i$ tells us where bead $i$ is located, we see that this multiplication should move bead $i$ to the right $k$ positions. This bead motion occurs all at once, not one step at a time, so bead $i$ is allowed to jump over any beads between its original position and its destination. However, there is a problem if the new position for bead $i$ already contains a bead. In the proofs below, we will see that two objects of opposite sign cancel whenever a bead collision like this occurs. If there is no collision, the motion of bead $i$ produces a new labeled abacus whose $x_i$-weight has increased by $k$. However, the sign of the new abacus (compared to the original) depends on the parity of the number of beads that bead $i$ jumps over when it moves to its new position.

To visualize these ideas more conveniently, we decimate our labeled abacus to obtain a labeled abacus with $k$ runners. Formally, the $k$-decimation of the labeled abacus $v = (v_j : j \ge 0) \in \mathrm{LAbc}(\lambda + \delta(N))$ is the $k$-tuple $(v^0, v^1, \ldots, v^{k-1})$, where $v^r_q = v_{qk+r}$. Moving a bead from position $j$ to position $j + k$ on the original abacus corresponds to moving a bead one position along its runner on the $k$-runner abacus. If there is already a bead in position $j + k$, we say that this bead move causes a bead collision. Otherwise, the bead motion produces a new labeled abacus in $\mathrm{LAbc}(\nu + \delta(N))$, for some $\nu \in \mathrm{Par}_N$. By ignoring the labels in the decimated abacus, we see that $\nu$ arises from $\lambda$ by adding one $k$-ribbon at the border. The shape of this ribbon determines the sign change caused by the bead move, as illustrated in the following example.


10.37. Example. Take $N = 6$, $k = 4$, $\lambda = (3,3,2,0,0,0)$, and $\mu = \lambda + \delta(6) = (8,7,5,2,1,0)$. Consider the following labeled abacus $v$ in $\mathrm{LAbc}(\mu)$ (a dot marks a gap):

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      2  4  1  .  .  5  .  6  3  .

This abacus has weight $x_1^2 x_2^0 x_3^8 x_4^1 x_5^5 x_6^7$ and sign $\mathrm{sgn}(3,6,5,1,4,2) = (-1)^{10} = +1$. Decimation by 4 produces the following 4-runner abacus, where runner $r$ lists positions $r, r+4, r+8$:

    r=0:  2  .  3
    r=1:  4  5  .
    r=2:  1  .  .
    r=3:  .  6  .

Suppose we move bead 1 four positions to the right in the original abacus, from position 2 to position 6:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      2  4  .  .  .  5  1  6  3  .

The new abacus has weight $x_1^6 x_2^0 x_3^8 x_4^1 x_5^5 x_6^7 = \mathrm{wt}(v)\, x_1^4$ and sign $\mathrm{sgn}(3,6,1,5,4,2) = (-1)^9 = -1$. The change in weight occurs since bead 1 moved four positions to the right. The change in sign occurs since bead 1 passed one other bead (bead 5) to reach its new position, and one basic transposition is needed to transform the permutation 3,6,5,1,4,2 into 3,6,1,5,4,2.


The decimation of the new abacus looks like:

    r=0:  2  .  3
    r=1:  4  5  .
    r=2:  .  1  .
    r=3:  .  6  .


This abacus is in $\mathrm{LAbc}(\nu) = \mathrm{LAbc}(\alpha + \delta(6))$, where $\nu = (8,7,6,5,1,0)$ and $\alpha = (3,3,3,3,0,0)$. Compare the diagrams of the partitions $\lambda$ and $\alpha$:


We obtain $\alpha$ from $\lambda$ by adding a new border 4-ribbon. To go from $\lambda$ to $\alpha$, we change part of the frontier of $\lambda$ from NEENE (where the first N step corresponds to bead 1) to EEENN (where the last N step corresponds to bead 1). There is one other N in this string, corresponding to the one bead (labeled 5) that bead 1 passes when it moves to position 6. Thus the number of passed beads (1 in this example) is one less than the number of rows occupied by the new border ribbon (2 in this example).

Let us return to the original abacus $v$ and move bead 5 four positions, from position 5 to position 9:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      2  4  1  .  .  .  .  6  3  5

This abacus has weight $x_1^2 x_2^0 x_3^8 x_4^1 x_5^9 x_6^7 = \mathrm{wt}(v)\, x_5^4$ and sign $\mathrm{sgn}(5,3,6,1,4,2) = (-1)^{10} = +1$. Note that the sign is unchanged since two basic transpositions are required to change the permutation 3,6,5,1,4,2 into 5,3,6,1,4,2. The decimation of the new abacus is:

    r=0:  2  .  3
    r=1:  4  .  5
    r=2:  1  .  .
    r=3:  .  6  .

This abacus lies in $\mathrm{LAbc}(\beta + \delta(6))$, where $\beta = (4,4,4,0,0,0)$. The diagram of $\beta$ arises by adding a border 4-ribbon to the diagram of $\lambda$:


This time the frontier changed from ...NENNE... (where the first N is bead 5) to ...EENNN... (where the last N is bead 5). The moved bead passed two other beads (beads 3 and 6), which is one less than the number of rows in the new ribbon (three). In general, the number of passed beads is one less than the number of N's in the frontier substring associated to the added ribbon. So the number of passed beads is one less than the number of rows in the added ribbon.

Finally, consider what would happen if we tried to move bead 4 (in the original abacus) four positions to the right. A collision occurs with bead 5, so this move is impossible. Now consider the labeled abacus $v'$ obtained by interchanging the labels 4 and 5 in $v$:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      2  5  1  .  .  4  .  6  3  .

Moving bead 5 four positions to the right in $v'$ causes a bead collision with bead 4. Notice that $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$ since $[3,6,4,1,5,2] = (4,5) \circ [3,6,5,1,4,2]$. Also note that $\mathrm{wt}(v)\, x_4^4 = \mathrm{wt}(v')\, x_5^4$; this equality is valid precisely because of the bead collisions. The abaci $v$ and $v'$ are examples of a matched pair of oppositely signed objects that cancel in the proof of the Pieri Rule given below.
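
The computations in Example 10.37 can be replayed with a small sketch. Here a labeled abacus is stored as a Python list whose entry at index $i$ is the bead label at position $i$ (0 for a gap); the helper names `label_word`, `sign`, `weight`, `move_bead`, and `decimate` are illustrative choices, not notation from the text.

```python
def label_word(v):
    """Return w(v): the bead labels read from right to left."""
    return [v[i] for i in range(len(v) - 1, -1, -1) if v[i] > 0]


def sign(v):
    """Sign of the abacus v, namely (-1)^inv(w(v))."""
    w = label_word(v)
    inv = sum(1 for a in range(len(w)) for b in range(a + 1, len(w)) if w[a] > w[b])
    return (-1) ** inv


def weight(v):
    """Exponent vector (e_1, ..., e_N) of wt(v): bead j at position i gives x_j^i."""
    exps = [0] * sum(1 for b in v if b > 0)
    for i, b in enumerate(v):
        if b > 0:
            exps[b - 1] = i
    return tuple(exps)


def move_bead(v, label, k):
    """Move bead `label` k positions to the right; return None on a collision."""
    i = v.index(label)
    new = list(v) + [0] * max(0, i + k + 1 - len(v))
    if new[i + k] != 0:
        return None
    new[i], new[i + k] = 0, label
    return new


def decimate(v, k):
    """k-decimation: runner r holds positions r, r + k, r + 2k, ..."""
    return [v[r::k] for r in range(k)]


v = [2, 4, 1, 0, 0, 5, 0, 6, 3, 0]
print(label_word(v), sign(v), weight(v))
print(sign(move_bead(v, 1, 4)), sign(move_bead(v, 5, 4)), move_bead(v, 4, 4))
```

Moving bead 1 flips the sign (one bead passed), moving bead 5 preserves it (two beads passed), and moving bead 4 reports a collision, matching the three cases discussed in the example.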


The observations in the last example motivate the following definition. 


10.38. Definition: Spin and Sign of Ribbons. The spin of a ribbon $R$, denoted $\mathrm{spin}(R)$, is one less than the number of rows occupied by the ribbon. The sign of $R$ is $\mathrm{sgn}(R) = (-1)^{\mathrm{spin}(R)}$.


We now have all the combinatorial ingredients needed to prove the Pieri Rule for mul- 
tiplication by a power-sum polynomial. 

10.39. The Antisymmetric Pieri Rule for $p_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \ge 1$,

$$a_{\lambda+\delta(N)}(x_1,\ldots,x_N)\, p_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a } k\text{-ribbon } R}} \mathrm{sgn}(R)\, a_{\beta+\delta(N)}(x_1,\ldots,x_N).$$


Proof. Let $X$ be the set of pairs $(v,i)$, where $v \in \mathrm{LAbc}(\lambda + \delta(N))$ and $1 \le i \le N$. For $(v,i) \in X$, set $\mathrm{sgn}(v,i) = \mathrm{sgn}(v)$ and $\mathrm{wt}(v,i) = \mathrm{wt}(v)\, x_i^k$. Then $a_{\lambda+\delta(N)}\, p_k = \sum_{z \in X} \mathrm{sgn}(z)\, \mathrm{wt}(z)$. We introduce a weight-preserving, sign-reversing involution $I$ on $X$. Given $(v,i)$ in $X$, try to move bead $i$ to the right $k$ positions in $v$. If this move causes a bead collision with bead $j$, let $v'$ be $v$ with beads $i$ and $j$ switched, and set $I(v,i) = (v',j)$. Otherwise, set $I(v,i) = (v,i)$. It can be verified that $I$ is an involution.

Consider the case where $I(v,i) = (v',j) \ne (v,i)$. Since the label permutation $w(v')$ is obtained from $w(v)$ by multiplying by the transposition $(i,j)$, $\mathrm{sgn}(v',j) = \mathrm{sgn}(v') = -\mathrm{sgn}(v) = -\mathrm{sgn}(v,i)$. The weight of $v$ must have the form $x_i^{p} x_j^{p+k} \cdots$ because of the bead collision, so $\mathrm{wt}(v') = x_j^{p} x_i^{p+k} \cdots$. It follows that $\mathrm{wt}(v,i) = \mathrm{wt}(v)\, x_i^k = x_i^{p+k} x_j^{p+k} \cdots = \mathrm{wt}(v')\, x_j^k = \mathrm{wt}(v',j)$. Thus, $I$ is a weight-preserving, sign-reversing map.

Now consider a fixed point $(v,i)$ of $I$. Let $v^*$ be the abacus obtained from $v$ by moving bead $i$ to the right $k$ positions, so $\mathrm{wt}(v^*) = \mathrm{wt}(v)\, x_i^k = \mathrm{wt}(v,i)$. Since the unlabeled $k$-runner abacus for $v^*$ arises from the unlabeled $k$-runner abacus for $v$ by moving one bead one step along its runner, it follows that $v^* \in \mathrm{LAbc}(\beta + \delta(N))$ for a unique $\beta \in \mathrm{Par}_N$ such that $R = \beta/\lambda$ is a $k$-ribbon. As explained earlier, $\mathrm{sgn}(v^*)$ differs from $\mathrm{sgn}(v)$ by the factor $\mathrm{sgn}(R) = (-1)^{\mathrm{spin}(R)}$, where $\mathrm{spin}(R)$ is the number of beads that bead $i$ passes over when it moves. Conversely, any abacus $y$ counted by $a_{\beta+\delta(N)}$ (for some shape $\beta$ as above) arises from a unique fixed point $(v,i) \in X$, since the moved bead $i$ is uniquely determined by the shapes $\lambda$ and $\beta$, and $v$ is determined from $y$ by moving the bead $i$ back $k$ positions. These remarks show that the sum appearing on the right side of the theorem is the generating function for the fixed point set of $I$, which completes the proof. □

10.40. Example. When $N = 6$, we calculate

$$a_{(3,3,2)+\delta(6)}\, p_4 = -a_{(3,3,3,3)+\delta(6)} + a_{(4,4,4)+\delta(6)} - a_{(6,4,2)+\delta(6)} + a_{(7,3,2)+\delta(6)} + a_{(3,3,2,2,1,1)+\delta(6)}$$

by adding border 4-ribbons to the shape $(3,3,2)$, as shown here:


Observe that the last shape pictured does not contribute to the sum because it has more than $N$ parts. An antisymmetric polynomial indexed by this shape would appear for $N \ge 7$.
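
The bead-moving description of the rule translates directly into a short computation on the unlabeled abacus. The following sketch (with the hypothetical name `pieri_power_sum`) lists each shape $\beta$ together with the sign $(-1)^{\text{number of beads passed}}$; it is an illustration of the rule, not a proof of it.

```python
def pieri_power_sum(lam, k, n):
    """Expand a_{lam+delta(n)} * p_k as a dict {beta: sign} using bead moves."""
    lam = list(lam) + [0] * (n - len(lam))
    pos = [lam[i] + (n - 1 - i) for i in range(n)]  # lam + delta(n), decreasing
    occupied = set(pos)
    result = {}
    for p in pos:
        q = p + k
        if q in occupied:
            continue  # bead collision: these terms cancel in pairs
        passed = sum(1 for r in pos if p < r < q)  # beads jumped over = spin of ribbon
        new_pos = sorted((occupied - {p}) | {q}, reverse=True)
        beta = tuple(new_pos[i] - (n - 1 - i) for i in range(n))
        result[beta] = (-1) ** passed
    return result


for beta, sgn in sorted(pieri_power_sum((3, 3, 2), 4, 6).items()):
    print(sgn, beta)
```

With `lam = (3, 3, 2)`, `k = 4`, and `n = 6`, the five shapes and signs agree with Example 10.40; the bead at position 1 collides with the bead at position 5 and contributes nothing.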

10.9 The Pieri Rule for $e_k$


Next we derive Pieri Rules for calculating $a_{\lambda+\delta(N)}\, e_k$ and $a_{\lambda+\delta(N)}\, h_k$. Our starting point is the following expression for the elementary symmetric polynomial $e_k$:

$$e_k(x_1,\ldots,x_N) = \sum_{\substack{S \subseteq \{1,2,\ldots,N\} \\ |S| = k}} \prod_{j \in S} x_j.$$

Let $S = \{j_1,\ldots,j_k\}$ be a fixed $k$-element subset of $\{1,2,\ldots,N\}$. Then $\prod_{j \in S} x_j = x_{j_1} x_{j_2} \cdots x_{j_k}$ is a typical term in the polynomial $e_k$. On the other hand, a typical term in $a_{\lambda+\delta(N)}$ corresponds to a signed weighted abacus $v$. Let us investigate what happens to the abacus when we multiply such a term by $x_{j_1} \cdots x_{j_k}$.

Since the power of $x_j$ indicates which position bead $j$ occupies, multiplication by $x_{j_1} \cdots x_{j_k}$ should cause each of the beads labeled $j_1,\ldots,j_k$ to move one position to the right. We execute this action by scanning the positions of $v$ from right to left. Whenever we see a bead labeled $j$ for some $j \in S$, we move this bead one step to the right, thus multiplying the weight by $x_j$. Bead collisions may occur, which will lead to object cancellations in the proof below. In the case where no bead collisions happen, we obtain a new abacus $v^* \in \mathrm{LAbc}(\nu + \delta(N))$. The beads on this abacus occur in the same order as on $v$, so $w(v^*) = w(v)$ and $\mathrm{sgn}(v^*) = \mathrm{sgn}(v)$. Recalling that the parts of $\lambda$ (respectively $\nu$) count the number of bead moves needed to justify the beads in $v$ (respectively $v^*$), it follows that $\nu \in \mathrm{Par}_N$ is a partition obtained from $\lambda \in \mathrm{Par}_N$ by adding 1 to $k$ distinct parts of $\lambda$. This means that the skew shape $\nu/\lambda$ is a vertical strip of size $k$ (see Definition 9.57).


10.41. Example. Let $N = 6$ and $\lambda = (3,3,2,2)$. Let $v$ be the abacus in $\mathrm{LAbc}(\lambda + \delta(6))$ shown here (a dot marks a gap):

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  3  2  .  4  6  .

Suppose $k = 3$ and $S = \{1,2,3\}$. We move bead 2, then bead 3, then bead 1 one step right on the abacus. No bead collision occurs, and we get the following abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  1  .  .  3  2  4  6  .

This abacus lies in $\mathrm{LAbc}(\nu + \delta(6))$, where $\nu = (3,3,3,3,1)$. Drawing the diagrams, we see that $\nu$ arises from $\lambda$ by adding a vertical 3-strip:


Suppose instead that $S = \{1,2,6\}$. This time we obtain the abacus

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  1  .  3  .  2  4  .  6

which is in $\mathrm{LAbc}((4,3,3,2,1) + \delta(6))$. Now the partition diagrams look like this:


However, suppose we start with the subset $S = \{3,5,6\}$. When we move bead 6, then bead 3, then bead 5 on the abacus $v$, bead 3 collides with bead 2. We can match the pair $(v,S)$ to $(v',S')$, where $S' = \{2,5,6\}$ and $v'$ is this abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  2  3  .  4  6  .

Observe that $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$ and $\mathrm{wt}(v)\, x_3 x_5 x_6 = \mathrm{wt}(v')\, x_2 x_5 x_6$. This example illustrates the cancellation idea used in the proof below.


10.42. The Antisymmetric Pieri Rule for $e_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \ge 1$,

$$a_{\lambda+\delta(N)}(x_1,\ldots,x_N)\, e_k(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a vertical } k\text{-strip}}} a_{\beta+\delta(N)}(x_1,\ldots,x_N).$$

Proof. Let $X$ be the set of pairs $(v,S)$ where $v \in \mathrm{LAbc}(\lambda + \delta(N))$ and $S$ is a $k$-element subset of $\{1,2,\ldots,N\}$. Letting $\mathrm{sgn}(v,S) = \mathrm{sgn}(v)$ and $\mathrm{wt}(v,S) = \mathrm{wt}(v) \prod_{j \in S} x_j$, the remarks at the start of this section show that

$$a_{\lambda+\delta(N)}\, e_k = \sum_{z \in X} \mathrm{sgn}(z)\, \mathrm{wt}(z).$$

Define an involution $I : X \to X$ as follows. Given $(v,S) \in X$, scan the abacus $v$ from right to left and move each bead in $S$ one step to the right. If this can be done with no bead collisions, we obtain an abacus $v^*$ counted by the sum on the right side of the theorem, such that $\mathrm{sgn}(v) = \mathrm{sgn}(v^*)$ and $\mathrm{wt}(v,S) = \mathrm{wt}(v^*)$. In this case, $(v,S)$ is a fixed point of $I$, and the bead motion rule defines a sign-preserving, weight-preserving bijection between these fixed points and the abaci counted by the right side of the theorem.

Now suppose a bead collision does occur. Then for some $j \in S$ and some $k \notin S$, bead $k$ lies one step to the right of bead $j$ in $v$. Take $j$ to be the rightmost bead in $v$ for which this is true. Let $I(v,S) = (v',S')$ where $v'$ is $v$ with beads $j$ and $k$ interchanged, and $S'$ is $S$ with $j$ removed and $k$ added. One immediately verifies that $\mathrm{sgn}(v',S') = -\mathrm{sgn}(v,S)$, $\mathrm{wt}(v,S) = \mathrm{wt}(v',S')$, and $I(v',S') = (v,S)$. So $I$ cancels all objects in which a bead collision occurs. □
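
The fixed points of the involution can be enumerated on the unlabeled abacus: choose $k$ beads, scan from right to left, and keep the choice only if no collision occurs. The sketch below does exactly this; the function name `pieri_elementary` and the set-of-tuples output are illustrative assumptions.

```python
from itertools import combinations


def pieri_elementary(lam, k, n):
    """Shapes beta (padded to n parts) with a_{lam+delta(n)} * e_k = sum of a_{beta+delta(n)}."""
    lam = list(lam) + [0] * (n - len(lam))
    pos = [lam[i] + (n - 1 - i) for i in range(n)]  # index 0 is the rightmost bead
    shapes = set()
    for chosen in combinations(range(n), k):
        new_pos = list(pos)
        collision = False
        for i in range(n):  # scan the beads from right to left
            if i in chosen:
                if i > 0 and new_pos[i] + 1 == new_pos[i - 1]:
                    collision = True  # the next position already holds a bead
                    break
                new_pos[i] += 1
        if not collision:
            shapes.add(tuple(new_pos[i] - (n - 1 - i) for i in range(n)))
    return shapes


for beta in sorted(pieri_elementary((3, 3, 2, 2), 3, 6), reverse=True):
    print(beta)
```

For $\lambda = (3,3,2,2)$, $k = 3$, and $N = 6$, the output includes the shapes $(3,3,3,3,1)$ and $(4,3,3,2,1)$ found in Example 10.41, and every listed shape differs from $\lambda$ by at most 1 in each part.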


10.10 The Pieri Rule for $h_k$


In the last section, we computed $a_{\lambda+\delta(N)}\, e_k$ by using a $k$-element subset of $\{1,2,\ldots,N\}$ to move beads on a labeled abacus. Now we compute $a_{\lambda+\delta(N)}\, h_k$ by moving beads based on a $k$-element multiset. This approach is motivated by the formula

$$h_k(x_1,\ldots,x_N) = \sum_{\substack{k\text{-element multisets} \\ M \text{ of } \{1,\ldots,N\}}} \prod_{j \in M} x_j,$$

where the factor $x_j$ is repeated as many times as $j$ appears in $M$.

Suppose $v$ is an abacus counted by $a_{\lambda+\delta(N)}$, and $x_1^{m_1} \cdots x_N^{m_N}$ is a typical term in $h_k$ (so each $m_j \ge 0$ and $m_1 + \cdots + m_N = k$). Scan the beads in $v$ from left to right. Whenever we encounter a bead labeled $j$, we move it right, one step at a time, for a total of $m_j$ positions. Bead collisions may occur and will lead to object cancellations later. If no collision occurs, we have a new abacus $v^* \in \mathrm{LAbc}(\nu + \delta(N))$ with the same sign as $v$ and weight $\mathrm{wt}(v^*) = \mathrm{wt}(v)\, x_1^{m_1} \cdots x_N^{m_N}$. It follows from the bead motion rule that the shape $\nu$ arises from $\lambda$ by adding a horizontal $k$-strip to $\lambda$ (see Definition 9.57). Conversely, any abacus indexed by such a shape can be constructed from an abacus indexed by $\lambda$ by an appropriate choice of the bead multiset. These ideas are illustrated in the following example, which should be compared to the example in the preceding section.


10.43. Example. Let $N = 6$ and $\lambda = (3,3,2,2)$. Let $v$ be the abacus in $\mathrm{LAbc}(\lambda + \delta(6))$ shown here (a dot marks a gap):

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  3  2  .  4  6  .

Let $M$ be the multiset $[1,1,2]$. It is possible to move bead 1 to the right twice in a row, and then move bead 2 once, without causing any collisions. This produces the following abacus:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  .  1  3  .  2  4  6  .

This abacus lies in $\mathrm{LAbc}(\nu + \delta(6))$, where $\nu = (3,3,3,2,2)$ arises from $\lambda$ by adding a horizontal 3-strip:


If instead we take $M = [1,2,6]$, we move bead 1, then bead 2, then bead 6, leading to this abacus in $\mathrm{LAbc}((4,3,3,2,1) + \delta(6))$:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  .  1  .  3  .  2  4  .  6

On the other hand, suppose we try to modify $v$ using the multiset $M = [1,2,3]$. When scanning $v$ from left to right, bead 3 moves before bead 2 and collides with bead 2. We match the pair $(v,M)$ with the pair $(v',M')$, where $M' = [1,2,2]$ and $v'$ is the following abacus, obtained from $v$ by interchanging beads 2 and 3:

    position:  0  1  2  3  4  5  6  7  8  9
    bead:      5  1  .  .  2  3  .  4  6  .

Observe that $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$ and $\mathrm{wt}(v)\, x_1 x_2 x_3 = x_1^2 x_2^6 x_3^5 x_4^7 x_5^0 x_6^8 = \mathrm{wt}(v')\, x_1 x_2^2$. This example illustrates the cancellation idea used in the proof below.
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10.44. The Antisymmetric Pieri Rule for h;. For all \ © Pary and all k > 1, 


@48(N)(%1,---,2N)hg(21,.-.,0N) = s @g+8(w)(%1,++-, EN): 


BeParn: 
B/X is a horizontal k-strip 


Proof. Let X be the set of pairs (v,M) where v € LAbc(A + 6(N)) and M = 
(1m2™2...N™N] is a k-element multiset. Defining sgn(v, MM) = sgn(v) and wt(v,M) = 


wt(v) ea z;”, we have 


a+46(N) hk = > sgn(z) wt(z). 

zEX 
Define an involution $I : X \to X$ as follows. Given $(v,M) \in X$, scan the abacus $v$ from
left to right. When bead $i$ is encountered in the scan, move it $m_i$ steps right, one step
at a time. If all bead motions are completed with no bead collisions, we obtain an abacus
$v^*$ counted by the sum on the right side of the theorem, such that $\mathrm{sgn}(v) = \mathrm{sgn}(v^*)$ and
$\mathrm{wt}(v,M) = \mathrm{wt}(v^*)$. In this case, $(v,M)$ is a fixed point of $I$, and the bead motion rule
defines a sign-preserving, weight-preserving bijection between these fixed points and the
abaci counted by the right side of the theorem.

Now consider the case where a bead collision does occur. Suppose the first collision
occurs when bead $j$ hits a bead $k$ that is located $p \leq m_j$ positions to the right of bead $j$'s
initial position. Define $I(v,M) = (v',M')$, where $v'$ is $v$ with beads $j$ and $k$ interchanged,
and $M'$ is obtained from $M$ by letting $j$ occur $m_j - p \geq 0$ times in $M'$, letting $k$ occur
$m_k + p$ times in $M'$, and leaving all other multiplicities the same. One may check that
$\mathrm{sgn}(v',M') = -\mathrm{sgn}(v,M)$, $\mathrm{wt}(v,M) = \mathrm{wt}(v',M')$, and $I(v',M') = (v,M)$. So $I$ cancels all
objects in which a bead collision occurs. □
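
The involution in this proof can be tested by brute force on a tiny case. The following is a minimal sketch, not the author's code: the names `involution`, `sign`, `weight`, and `all_objects` are hypothetical, an abacus is modeled as a dict from positions to bead labels, and the sign convention (parity of inversions of the labels read left to right) is an assumption that only matters up to a global factor.

```python
from itertools import combinations_with_replacement, permutations

def involution(abacus, mult):
    """Apply I to (v, M). abacus: position -> label; mult: label -> multiplicity m_i."""
    for pos in sorted(abacus):                 # scan beads from left to right
        j = abacus[pos]
        for p in range(1, mult[j] + 1):        # move bead j one step at a time
            if pos + p in abacus:              # first collision: bead k sits p steps right
                k = abacus[pos + p]
                new_abacus = dict(abacus)
                new_abacus[pos], new_abacus[pos + p] = k, j   # interchange beads j and k
                new_mult = dict(mult)
                new_mult[j] = mult[j] - p      # j now occurs m_j - p times
                new_mult[k] = mult[k] + p      # k now occurs m_k + p times
                return new_abacus, new_mult
    return abacus, mult                        # no collision: fixed point

def sign(abacus):
    """Assumed convention: (-1)^(number of inversions of labels read left to right)."""
    labels = [abacus[pos] for pos in sorted(abacus)]
    inv = sum(1 for a in range(len(labels)) for b in range(a + 1, len(labels))
              if labels[a] > labels[b])
    return (-1) ** inv

def weight(abacus, mult):
    """Exponent of x_label in wt(v) * prod_i x_i^{m_i}."""
    return {label: pos + mult[label] for pos, label in abacus.items()}

def all_objects(positions, k):
    """All pairs (v, M): labelings of the given positions and k-element multisets."""
    labels = range(1, len(positions) + 1)
    for perm in permutations(labels):
        abacus = dict(zip(sorted(positions), perm))
        for multiset in combinations_with_replacement(labels, k):
            yield abacus, {i: multiset.count(i) for i in labels}
```

For $\lambda = (2,1,0)$, $N = 3$, and $k = 2$ the bead positions are $\{0,2,4\}$. There are four horizontal 2-strips that can be added to $\lambda$ within $\mathrm{Par}_3$, so one expects $4 \cdot 3! = 24$ fixed points among the 36 objects.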




10.11 Antisymmetric Polynomials and Schur Polynomials 


The Pieri Rule for computing $a_{\lambda+\delta(N)}h_k$ closely resembles the rule for computing $s_\lambda h_k$
from §9.11. This resemblance leads to an algebraic proof of a formula expressing Schur
polynomials as quotients of antisymmetric polynomials.


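10.45. Theorem: Schur Polynomials and Antisymmetric Polynomials. For all
$\lambda \in \mathrm{Par}_N$,

\[ s_\lambda(x_1,\ldots,x_N)
   = \frac{a_{\lambda+\delta(N)}(x_1,\ldots,x_N)}{a_{\delta(N)}(x_1,\ldots,x_N)}
   = \frac{\det[x_j^{\lambda_i+N-i}]_{1\leq i,j\leq N}}{\det[x_j^{N-i}]_{1\leq i,j\leq N}}. \]

Proof. In Theorem 9.64, we iterated the Pieri Rule

\[ s_\nu(x_1,\ldots,x_N)\,h_k(x_1,\ldots,x_N)
   = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\nu \text{ is a horizontal } k\text{-strip}}}
     s_\beta(x_1,\ldots,x_N) \]

to deduce the formula

\[ h_\mu(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda,\mu}\,s_\lambda(x_1,\ldots,x_N)
   \quad\text{for all } \mu \in \mathrm{Par}_N. \tag{10.3} \]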


Recall that this derivation used semistandard tableaux to encode the sequence of horizontal
strips that were added to go from the empty shape to the shape $\lambda$. Now, precisely the same
idea can be applied to iterate the antisymmetric Pieri Rule

\[ a_{\nu+\delta(N)}(x_1,\ldots,x_N)\,h_k(x_1,\ldots,x_N)
   = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\nu \text{ is a horizontal } k\text{-strip}}}
     a_{\beta+\delta(N)}(x_1,\ldots,x_N). \]

If we start with $\nu = (0)$ and multiply successively by $h_{\mu_1}, h_{\mu_2}, \ldots$, we obtain the formula

\[ a_{0+\delta(N)}(x_1,\ldots,x_N)\,h_\mu(x_1,\ldots,x_N)
   = \sum_{\lambda \in \mathrm{Par}_N} K_{\lambda,\mu}\,a_{\lambda+\delta(N)}(x_1,\ldots,x_N)
   \quad\text{for all } \mu \in \mathrm{Par}_N. \tag{10.4} \]


Now restrict attention to partitions $\lambda, \mu \in \mathrm{Par}_N(m)$. As in Theorem 9.66, we can write
the equations in (10.3) in the form $H = K^{\mathrm{tr}}S$, where $H = (h_\mu : \mu \in \mathrm{Par}_N(m))$ and
$S = (s_\lambda : \lambda \in \mathrm{Par}_N(m))$ are column vectors, and $K^{\mathrm{tr}}$ is the transpose of the Kostka matrix.
Letting $A = (a_{\lambda+\delta(N)}/a_{\delta(N)} : \lambda \in \mathrm{Par}_N(m))$, we can similarly write the equations in (10.4)
in the form $H = K^{\mathrm{tr}}A$. Finally, since the Kostka matrix is invertible (being unitriangular),
we can conclude that

\[ A = (K^{\mathrm{tr}})^{-1}H = S. \]

Equating entries of these vectors gives the result. □
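
The quotient formula can be sanity-checked numerically. The sketch below is illustrative only: the function names are hypothetical, the determinant uses the permutation expansion (adequate for tiny matrices), and the tableau side is evaluated by brute-force enumeration of semistandard fillings.

```python
from fractions import Fraction
from itertools import permutations, product
from math import prod

def det(matrix):
    """Determinant via the sum over permutations (only sensible for small n)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        total += (-1) ** inversions * prod(matrix[r][perm[r]] for r in range(n))
    return total

def schur_by_quotient(shape, xs):
    """Evaluate det[x_j^(lambda_i + N - i)] / det[x_j^(N - i)] at distinct values xs."""
    n = len(xs)
    lam = list(shape) + [0] * (n - len(shape))
    top = [[x ** (lam[i] + n - 1 - i) for x in xs] for i in range(n)]     # rows are 0-indexed
    bottom = [[x ** (n - 1 - i) for x in xs] for i in range(n)]
    return Fraction(det(top), det(bottom))

def schur_by_tableaux(shape, xs):
    """Evaluate the sum of x^T over semistandard tableaux T with entries indexing xs."""
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    total = 0
    for values in product(range(len(xs)), repeat=len(cells)):
        filling = dict(zip(cells, values))
        semistandard = all(
            (c == 0 or filling[(r, c - 1)] <= v) and (r == 0 or filling[(r - 1, c)] < v)
            for (r, c), v in filling.items()
        )
        if semistandard:
            total += prod(xs[v] for v in values)
    return total
```

For instance, with $\lambda = (2,1)$ and $(x_1,x_2,x_3) = (2,3,5)$ both functions return 280, which agrees with the factorization $s_{(2,1)}(x_1,x_2,x_3) = (x_1+x_2)(x_1+x_3)(x_2+x_3)$.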


A combinatorial proof of the identity $a_{\lambda+\delta(N)} = s_\lambda a_{\delta(N)}$ is given in §10.13.


10.12 Rim-Hook Tableaux 


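The connection between Schur polynomials and antisymmetric polynomials lets us deduce
the following Pieri Rule for calculating the product $s_\lambda p_k$.


10.46. The Symmetric Pieri Rule for $p_k$. For all $\lambda \in \mathrm{Par}_N$ and all $k \geq 1$,

\[ s_\lambda(x_1,\ldots,x_N)\,p_k(x_1,\ldots,x_N)
   = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a } k\text{-ribbon } R}}
     \mathrm{sgn}(R)\,s_\beta(x_1,\ldots,x_N). \]

Proof. Start with the identity

\[ a_{\lambda+\delta(N)}(x_1,\ldots,x_N)\,p_k(x_1,\ldots,x_N)
   = \sum_{\substack{\beta \in \mathrm{Par}_N: \\ \beta/\lambda \text{ is a } k\text{-ribbon } R}}
     \mathrm{sgn}(R)\,a_{\beta+\delta(N)}(x_1,\ldots,x_N) \]

(proved in Theorem 10.39), divide both sides by $a_{\delta(N)}$, and use Theorem 10.45. □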


10.47. Example. Suppose we multiply $s_{(0)} = 1$ by $p_4$ using the Pieri Rule. The result is
a signed sum of Schur polynomials indexed by 4-ribbons:

\[ p_4 = s_{(0)}p_4 = s_{(4)} - s_{(3,1)} + s_{(2,1,1)} - s_{(1,1,1,1)}. \]

To expand $p_{(4,3)}$ into Schur polynomials, first multiply both sides of the previous equation
by $p_3$:

\[ p_{(4,3)} = p_4p_3 = s_{(4)}p_3 - s_{(3,1)}p_3 + s_{(2,1,1)}p_3 - s_{(1,1,1,1)}p_3. \]

Now use the Pieri Rule on each term on the right side. This leads to the diagrams shown
in Figure 10.3. Taking signs into account, we obtain

\begin{align*}
p_{(4,3)} ={}& s_{(7)} + s_{(4,3)} - s_{(4,2,1)} + s_{(4,1,1,1)} \\
 & - s_{(6,1)} + s_{(3,2,2)} - s_{(3,1,1,1,1)} \\
 & + s_{(5,1,1)} - s_{(3,3,1)} + s_{(2,1,1,1,1,1)} \\
 & - s_{(4,1,1,1)} + s_{(3,2,1,1)} - s_{(2,2,2,1)} - s_{(1,1,1,1,1,1,1)}.
\end{align*}

Here we are assuming $N$ (the number of variables) is at least 7.
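
The expansion above can be reproduced mechanically. The sketch below is an illustration under stated assumptions, not the text's own algorithm: it uses the standard abacus description of ribbon addition (adding a $k$-ribbon to $\lambda$ moves one bead of $\lambda+\delta(N)$ exactly $k$ positions right onto an empty position, with sign $(-1)^{\text{number of beads jumped}}$). The names `add_ribbons` and `power_sum_to_schur` are hypothetical.

```python
from collections import Counter

def add_ribbons(shape, k, n):
    """Yield (new_shape, sign) for every k-ribbon that can be added to shape, using n beads."""
    parts = list(shape) + [0] * (n - len(shape))
    beads = [parts[i] + (n - 1 - i) for i in range(n)]    # positions lambda_i + N - i
    occupied = set(beads)
    for b in beads:
        target = b + k
        if target in occupied:
            continue                                      # bead collision: no ribbon here
        jumped = sum(1 for c in beads if b < c < target)  # beads passed over by the move
        new_beads = sorted((occupied - {b}) | {target}, reverse=True)
        new_parts = [new_beads[i] - (n - 1 - i) for i in range(n)]
        yield tuple(p for p in new_parts if p > 0), (-1) ** jumped

def power_sum_to_schur(alpha):
    """Schur coefficients of p_alpha, obtained by iterating the Pieri Rule from the empty shape."""
    n = sum(alpha)                                        # enough beads for every shape
    expansion = Counter({(): 1})
    for k in alpha:
        step = Counter()
        for shape, coeff in expansion.items():
            for new_shape, sign in add_ribbons(shape, k, n):
                step[new_shape] += coeff * sign
        expansion = step
    return {shape: c for shape, c in expansion.items() if c != 0}
```

Calling `power_sum_to_schur((4, 3))` returns the twelve nonzero coefficients of the display above. The two terms $\pm s_{(4,1,1,1)}$ cancel, so that shape does not appear in the returned dictionary.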


[Figure 10.3 panels: shapes for $s_{(4)}p_3$; shapes for $-s_{(3,1)}p_3$; shapes for $s_{(2,1,1)}p_3$; shapes for $-s_{(1,1,1,1)}p_3$.]

FIGURE 10.3
Adding $k$-ribbons to compute $s_\lambda p_k$.


Just as we used semistandard tableaux to encode successive additions of horizontal 
strips, we can use the following notion of a rim-hook tableau to encode successive additions 
of signed ribbons. 


10.48. Definition: Rim-Hook Tableaux. Given a partition $\lambda$ and a sequence $\alpha \in \mathbb{Z}_{\geq 0}^s$,
a rim-hook tableau of shape $\lambda$ and content $\alpha$ is a sequence $T$ of partitions

\[ (0) = \nu^0 \subseteq \nu^1 \subseteq \nu^2 \subseteq \cdots \subseteq \nu^s = \lambda \]

such that $\nu^i/\nu^{i-1}$ is an $\alpha_i$-ribbon for all $i$ between 1 and $s$. We represent this tableau
pictorially by drawing the diagram of $\lambda$ and entering the number $i$ in each cell of the ribbon
$\nu^i/\nu^{i-1}$. The sign of the rim-hook tableau $T$ is the product of the signs of the ribbons
$\nu^i/\nu^{i-1}$. (Recall that the sign of a ribbon occupying $r$ rows is $(-1)^{r-1}$.) Let $\mathrm{RHT}(\lambda,\alpha)$ be
the set of all rim-hook tableaux of shape $\lambda$ and content $\alpha$. Finally, define the integer

\[ \chi^\lambda_\alpha = \sum_{T \in \mathrm{RHT}(\lambda,\alpha)} \mathrm{sgn}(T). \]

Rim-hook tableaux of skew shape $\lambda/\mu$ are defined analogously; now we require that $\nu^0 = \mu$,
so that the cells of $\mu$ do not get filled with ribbons. The set $\mathrm{RHT}(\lambda/\mu,\alpha)$ and the integer
$\chi^{\lambda/\mu}_\alpha$ are defined as above.

10.49. Example. Suppose we expand the product $p_4p_2p_3p_1$ into a sum of Schur polynomials.
We can do this by applying the Pieri Rule four times, starting with the empty shape.
Each application of the Pieri Rule adds a new border ribbon to the shape. The lengths of
the ribbons are given by the content vector $\alpha = (4,2,3,1)$. Here is one possible sequence of
ribbon additions:

[Figure: the shapes $(0) \subseteq (2,1,1) \subseteq (2,2,2) \subseteq (4,3,2) \subseteq (4,3,3)$, each obtained from the previous one by adding a border ribbon.]

This sequence of shapes defines a rim-hook tableau

\[ T = ((0), (2,1,1), (2,2,2), (4,3,2), (4,3,3)), \]

which can be visualized using the following diagram:

    T =  1 1 3 3
         1 2 3
         1 2 4

Note that the ribbons we added have signs $+1$, $-1$, $-1$, and $+1$, so $\mathrm{sgn}(T) = +1$. This particular
choice of ribbon additions therefore produces a term $+s_{(4,3,3)}$ in the Schur expansion
of $p_{(4,2,3,1)}$.

Now suppose we want to know the coefficient of $s_{(4,3,3)}$ in the Schur expansion of
$p_4p_2p_3p_1$. The preceding discussion shows that we obtain a term $\pm s_{(4,3,3)}$ for every rim-hook
tableau of shape $(4,3,3)$ and content $(4,2,3,1)$, where the sign of the term is the
sign of the tableau. To find the required coefficient, we must enumerate all the objects in
$\mathrm{RHT}((4,3,3),(4,2,3,1))$. In addition to the rim-hook tableau $T$ displayed above, we find
the following rim-hook tableaux:

    1 1 3 4      1 1 2 2      1 1 1 4      1 1 1 1
    1 2 3        1 3 3        1 2 2        2 3 3
    1 2 3        1 3 4        3 3 3        2 3 4

The signs of the new tableaux are $-1$, $-1$, $-1$, and $+1$, so the coefficient is $+1-1-1-1+1 = -1$.


The calculations in the preceding example generalize to give the following rule for ex- 
panding power-sum polynomials into sums of Schur polynomials. 


10.50. Theorem: Schur Expansion of Power-Sum Polynomials. For all $\alpha \in \mathbb{Z}_{\geq 0}^s$
and all $N \geq 1$,

\[ p_\alpha(x_1,\ldots,x_N) = \sum_{\lambda \in \mathrm{Par}_N} \chi^\lambda_\alpha\,s_\lambda(x_1,\ldots,x_N). \]

Proof. By iteration of the Pieri Rule, the coefficient of $s_\lambda$ in $p_\alpha = s_{(0)}p_{\alpha_1}\cdots p_{\alpha_s}$ is the
signed sum over all sequences of partitions

\[ (0) = \nu^0 \subseteq \nu^1 \subseteq \nu^2 \subseteq \cdots \subseteq \nu^s = \lambda \]

such that the skew shape $\nu^i/\nu^{i-1}$ is an $\alpha_i$-ribbon for all $i$. By the definition of rim-hook
tableaux, this sum is precisely $\chi^\lambda_\alpha$. □


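10.51. Theorem: Symmetry of $\chi^\lambda_\alpha$. If $\alpha$ and $\beta$ are compositions with $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$,
then $\chi^\lambda_\alpha = \chi^\lambda_\beta$ for all partitions $\lambda$.


Proof. The hypothesis implies that the sequence $\alpha = (\alpha_1,\alpha_2,\ldots)$ can be rearranged to
the sequence $\beta = (\beta_1,\beta_2,\ldots)$. It follows from this that $p_\alpha = \prod_i p_{\alpha_i} = \prod_i p_{\beta_i} = p_\beta$,
since multiplication of polynomials is commutative. Let $k = \sum_i \alpha_i$ and take $N \geq k$. Two
applications of the previous theorem give

\[ \sum_{\lambda \in \mathrm{Par}(k)} \chi^\lambda_\alpha\,s_\lambda = p_\alpha = p_\beta = \sum_{\lambda \in \mathrm{Par}(k)} \chi^\lambda_\beta\,s_\lambda. \]

By linear independence of the Schur polynomials $\{s_\lambda(x_1,\ldots,x_N) : \lambda \in \mathrm{Par}(k)\}$, we conclude
that $\chi^\lambda_\alpha = \chi^\lambda_\beta$ for all $\lambda$. □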


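10.52. Remark. These results extend to skew shapes as follows. If $\mu$ is a partition, then

\[ s_\mu(x_1,\ldots,x_N)\,p_\alpha(x_1,\ldots,x_N)
   = \sum_{\substack{\lambda \in \mathrm{Par}_N: \\ \mu \subseteq \lambda}} \chi^{\lambda/\mu}_\alpha\,s_\lambda(x_1,\ldots,x_N). \]

Furthermore, if $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$ then $\chi^{\lambda/\mu}_\alpha = \chi^{\lambda/\mu}_\beta$. The proof is the same as before,
replacing $(0)$ by $\mu$ throughout.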


We have just seen how to expand power-sum symmetric polynomials into sums of Schur
polynomials. Conversely, it is possible to express Schur polynomials in terms of the $p_\mu$. We
can use the Hall scalar product from §9.26 to derive this expansion from the previous one.


10.53. Theorem: Power-Sum Expansion of Schur Polynomials. For all $N \geq k$ and
$\lambda \in \mathrm{Par}(k)$,

\[ s_\lambda(x_1,\ldots,x_N) = \sum_{\mu \in \mathrm{Par}(k)} \frac{\chi^\lambda_\mu}{z_\mu}\,p_\mu(x_1,\ldots,x_N). \]

Proof. For all $\mu \in \mathrm{Par}(k)$, we know that $p_\mu = \sum_{\nu \in \mathrm{Par}(k)} \chi^\nu_\mu s_\nu$. Therefore, for a given
partition $\lambda \in \mathrm{Par}(k)$,

\[ \langle p_\mu, s_\lambda \rangle = \sum_{\nu \in \mathrm{Par}(k)} \chi^\nu_\mu \langle s_\nu, s_\lambda \rangle = \chi^\lambda_\mu, \]

since the Schur polynomials are orthonormal relative to the Hall scalar product. Now, since
the $p_\mu$ form a basis of $\Lambda_N^k$, we know there exist scalars $c_\nu \in \mathbb{R}$ with $s_\lambda = \sum_\nu c_\nu p_\nu$. To find
a given coefficient $c_\mu$, we compute

\[ \chi^\lambda_\mu = \langle p_\mu, s_\lambda \rangle = \sum_\nu c_\nu \langle p_\mu, p_\nu \rangle = c_\mu z_\mu, \]

where the last equality follows by definition of the Hall scalar product. We see that
$c_\mu = \chi^\lambda_\mu/z_\mu$, as needed. □
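
As a small worked instance of this theorem (added here for illustration, not taken from the text), take $k = 3$ and $\lambda = (2,1)$. The computation assumes the usual definition $z_\mu = \prod_i i^{m_i} m_i!$, where $m_i$ is the number of parts of $\mu$ equal to $i$. The values $\chi^{(2,1)}_\mu$ are found by listing rim-hook tableaux: two standard tableaux for $\mu = (1,1,1)$; one tableau of each sign for $\mu = (2,1)$; and the single 3-ribbon occupying two rows for $\mu = (3)$.

```latex
\begin{align*}
\chi^{(2,1)}_{(1,1,1)} &= 2,              & z_{(1,1,1)} &= 6, \\
\chi^{(2,1)}_{(2,1)}   &= (+1) + (-1) = 0, & z_{(2,1)}   &= 2, \\
\chi^{(2,1)}_{(3)}     &= -1,             & z_{(3)}     &= 3, \\
s_{(2,1)} &= \tfrac{2}{6}\,p_{(1,1,1)} + \tfrac{0}{2}\,p_{(2,1)} - \tfrac{1}{3}\,p_{(3)}
           = \tfrac{1}{3}\bigl(p_1^3 - p_3\bigr).
\end{align*}
```

At $(x_1,x_2,x_3) = (2,3,5)$ this gives $(10^3 - 160)/3 = 280$, matching $s_{(2,1)}(2,3,5) = (2+3)(2+5)(3+5) = 280$.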


10.13 Abaci and Tableaux


This section contains a combinatorial proof of the identity

\[ a_{\delta(N)}(x_1,\ldots,x_N)\,s_\lambda(x_1,\ldots,x_N) = a_{\lambda+\delta(N)}(x_1,\ldots,x_N), \]

which we proved algebraically in §10.11.
Let $X$ be the set of pairs $(v,T)$, where $v$ is a justified labeled abacus with $N$ beads and
$T$ is a semistandard tableau using letters in $\{1,2,\ldots,N\}$. We need to use the following
non-standard total ordering on this alphabet that depends on $v$: define $i <_v j$ iff bead $i$
is to the right of bead $j$ on the abacus $v$. Equivalently, writing $v = v_0v_1\cdots v_{N-1}00\cdots$, we can
describe the total order by

\[ v_{N-1} <_v v_{N-2} <_v \cdots <_v v_1 <_v v_0. \]


Here are two examples of objects in $X$ when $N = 7$ and $\lambda = (7,7,5,3,2)$:

[Figure: two objects $(v,T)$ and $(v',T')$ in $X$.]

Note that we can pass from the first tableau (which is semistandard under the usual ordering
of integers) to the second tableau (which is semistandard relative to one of the non-standard
orderings) by applying the permutation sending 7 to 2, 6 to 4, etc., to each entry in the first
tableau. It follows that the generating function for the set $\mathrm{SSYT}_N(\lambda)$ relative to one of the
non-standard orderings $<_v$ can be obtained from the generating function for semistandard
tableaux relative to $<$ (namely $s_\lambda(x_1,\ldots,x_N)$) by applying the permutation sending $x_7$ to
$x_2$, $x_6$ to $x_4$, etc. Since Schur polynomials are symmetric, the answer is still $s_\lambda(x_1,\ldots,x_N)$.
By the Product Rule for Weighted Sets, we conclude that

\[ \sum_{(v,T) \in X} \mathrm{sgn}(v)\,\mathrm{wt}(v)\,x^T = a_{\delta(N)}(x_1,\ldots,x_N)\,s_\lambda(x_1,\ldots,x_N). \]
(v,T)EX 


On the other hand, let $Y = \mathrm{LAbc}(\lambda+\delta(N))$ be the set of $N$-bead labeled abaci with
beads in positions $\lambda+\delta(N)$. The generating function for the signed weighted set $Y$ is
$a_{\lambda+\delta(N)}(x_1,\ldots,x_N)$. So it suffices to define a sign-reversing, weight-preserving involution
$I : X \to X$ where the fixed point set of $I$ corresponds bijectively to $Y$. The main idea is
that the tableau $T$ encodes a sequence of bead motions on the abacus $v$. If performing these
movements causes a bead collision, then $(v,T)$ will cancel with some other object in $X$.
Otherwise, the abacus obtained from $v$ by the bead motions is one of the objects in $Y$.

A tableau $T$ specifies bead motions as follows. Define the reading word of $T$ to be the
word $w(T) = w_1w_2\cdots w_n$ (where $n = |\lambda|$) obtained by concatenating the rows of $T$ from
bottom to top. For example, the object $(v',T')$ shown above has

\[ w(T') = 224447755566666663333333. \]

Given $(v,T) \in X$, scan the symbols in $w(T)$ from right to left. When a symbol $j$ is encountered,
move the bead labeled $j$ in $v$ one step to the right.

Let us first determine which objects $(v,T)$ have no bead collisions. Suppose $v =
v_0\cdots v_{N-1}00\cdots$. Let $i$ be the last entry in the top row of $T$, which is the rightmost letter in
$w(T)$. We must first move bead $i$ one step to the right. This move already causes a collision
(since $v$ is justified) unless $i = v_{N-1}$. Since $v_{N-1}$ is the smallest letter relative to $<_v$ and
$T$ is semistandard, $i = v_{N-1}$ iff all entries in the top row of $T$ are equal to $v_{N-1}$. In this
situation, we move the rightmost bead $v_{N-1}$ to the right $\lambda_1$ positions with no collisions.

Now we repeat the argument on the second row of $T$. The rightmost entry $i$ in this row
cannot be $v_{N-1}$ (otherwise we would not have a strict increase in every column). The only
way to avoid an immediate bead collision is when $i = v_{N-2}$, in which case all entries in the
second row must equal $v_{N-2}$. In this situation, bead $v_{N-2}$ moves to the right $\lambda_2$ positions
with no collisions.

Continuing similarly, we see that $(v,T)$ has no collisions iff for all $k$, the $k$th row of
$T$ consists of $\lambda_k$ copies of the $k$th smallest letter $v_{N-k}$. Moving the beads on $v$ according
to $T$ has the effect of unjustifying $v$ to an abacus $v^* \in Y = \mathrm{LAbc}(\lambda+\delta(N))$. Defining
$I(v,T) = (v,T)$ in this case, we have specified a bijection between the set of fixed points of
$I$ and $Y$. For example, taking $\lambda = (7,7,5,3,2)$ and letting $T$ be the tableau whose rows (from
top to bottom) are 3333333, 6666666, 77777, 111, 55,

\[ (v,T) = (2451763000\cdots,\ T) \quad\text{maps to}\quad v^* = 24005010070063000\cdots. \]

The map sending $(v,T)$ to $v^*$ preserves signs and weights.

To complete the proof, we describe a cancellation mechanism to pair off objects $(v,T)$ in
which bead collisions do occur. Suppose the first bead collision for $(v,T)$ occurs when some
bead $i$ moves to the right one step and bumps into bead $j$. Note that $i >_v j$, and $i$ and $j$
must be two adjacent letters in the total ordering $>_v$. Define $(v',T') = I(v,T)$ as follows.
We obtain $v'$ from $v$ by interchanging the adjacent beads $i$ and $j$, so that $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$,
$\mathrm{wt}(v')x_j = \mathrm{wt}(v)x_i$, and $<_{v'}$ agrees with $<_v$ except that now $i <_{v'} j$.

We obtain $T'$ from $T$ by modifying the occurrences of $i$ and $j$ in $w(T)$ by a procedure
similar to the one used in §9.5. By the same reasoning used to determine the fixed points
of $I$, we know that the occurrence of $i$ in $T$ that caused the bead collision is the rightmost
entry in some row of $T$, say the $k$th row; furthermore, for $1 \leq l < k$, row $l$ consists of $\lambda_l$
copies of $v_{N-l}$. Now $i >_v v_{N-k}$ (or this entry of $T$ would not cause a collision), and so
$j \geq_v v_{N-k}$. This means that no entry in the first $k-1$ rows of $T$ equals $i$ or $j$, so these
rows can be ignored in the following discussion.

We now describe how to change $T$ into $T'$. Whenever $j$ occurs directly above $i$ in $T$
(call these occurrences matched pairs), interchange these two symbols. Some rows of $T$ may
contain unmatched $i$'s and $j$'s, in which $a \geq 0$ copies of $j$ are followed by $b \geq 0$ copies
of $i$. In particular, row $k$ has $a \geq 0$ and $b > 0$, since the $i$ at the end of the row cannot
be matched with a $j$ above it. In row $k$, replace the unmatched symbols $j^ai^b$ by $j^{a+1}i^{b-1}$.
Then, in all rows containing unmatched $i$'s and $j$'s (including the new row $k$), replace
the unmatched symbols $j^ai^b$ by $i^bj^a$. The following assertions can now be checked: $T'$ is
a semistandard tableau relative to $<_{v'}$; $T'$ has one fewer $i$ and one more $j$ than $T$ does;
$x^{T'}x_i = x^Tx_j$; $\mathrm{wt}(v',T') = \mathrm{wt}(v,T)$; $\mathrm{sgn}(v',T') = -\mathrm{sgn}(v,T)$; the last symbol in row $k$ of $T'$
is an unmatched $j$; this unmatched $j$ causes the first bead collision when $T'$ is used to move
the beads on $v'$; and $I(v',T') = (v,T)$.

10.54. Example. Consider the object

\[ (v,T) = (2451763000\cdots,\ T). \]

Processing the first two rows of $T$, we move bead 3 right seven positions, then move bead
6 right seven positions with no collisions. But in row 3, the rightmost symbol $i = 5$ causes
a collision with bead $j = 1$. There are no matched pairs of 5's and 1's in this tableau, so
we first change the 555 in row 3 to 155, and then change this string to 551 to preserve
semistandardness under the new ordering. We have

\[ I(v,T) = (2415763000\cdots,\ T'). \]

If we apply $I$ to this object, bead 1 bumps into bead 5, and we find that $I(I(v,T)) = (v,T)$.


10.55. Example. Consider the object

\[ (v,T) = (2451763000\cdots,\ T). \]

Now the first collision occurs when bead $i = 7$ bumps into bead $j = 6$ because of the 7 at
the end of the second row of $T$. The first two 6's in that row are matched with 7's below,
so the unmatched $i$'s and $j$'s in row 2 form the word 67777. We replace this string first by
66777, then by 77766. Interchanging the matched 6's and 7's leads to

\[ I(v,T) = (2451673000\cdots,\ T'). \]


10.14 Skew Schur Polynomials 


In the remainder of this chapter, we develop further combinatorial properties of skew Schur
polynomials. Recall Definition 9.132: for every skew shape $\lambda/\mu$,

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \sum_{T \in \mathrm{SSYT}_N(\lambda/\mu)} x^T. \]

Theorem 9.135 states that skew Schur polynomials are symmetric. More precisely, we have
the following expansion in the monomial basis:

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \sum_{\nu \in \mathrm{Par}_N} K_{\lambda/\mu,\nu}\,m_\nu(x_1,\ldots,x_N), \]

where $K_{\lambda/\mu,\nu}$ is the number of semistandard tableaux of shape $\lambda/\mu$ and content $\nu$. Our
current goal is to find combinatorial formulas for the expansion of skew Schur polynomials
relative to some other bases for the vector space $\Lambda_N$. We begin by proving an algebraic fact
involving the Hall scalar product.


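10.57. Theorem: Skew Schur Polynomials and the Hall Scalar Product. Suppose
$\lambda, \mu \in \mathrm{Par}$, $k = |\lambda| - |\mu|$, $N \geq |\lambda|$, and $f \in \Lambda_N^k$. Then $\langle s_{\lambda/\mu}, f \rangle = \langle s_\lambda, s_\mu f \rangle$.


Proof. We first prove the result for $f = h_\nu$, where $\nu \in \mathrm{Par}(k)$. On one hand, we have the
expansion

\[ s_{\lambda/\mu} = \sum_{\rho \in \mathrm{Par}(k)} K_{\lambda/\mu,\rho}\,m_\rho. \]

Taking the scalar product of both sides with $h_\nu$ gives $\langle s_{\lambda/\mu}, h_\nu \rangle = K_{\lambda/\mu,\nu}$, by Theorem 9.125.
On the other hand, the Pieri Rule shows that

\[ s_\mu h_\nu = \sum_\rho K_{\rho/\mu,\nu}\,s_\rho \]

(see Theorem 9.136). Taking the scalar product of both sides with $s_\lambda$ gives $\langle s_\lambda, s_\mu h_\nu \rangle =
K_{\lambda/\mu,\nu}$. Thus the result holds for every $f$ in the complete homogeneous basis.

The general case now follows by linearity: given any $f \in \Lambda_N^k$, write $f = \sum_\nu c_\nu h_\nu$ for
certain real scalars $c_\nu$. Then compute

\begin{align*}
\langle s_{\lambda/\mu}, f \rangle
 &= \Bigl\langle s_{\lambda/\mu}, \sum_\nu c_\nu h_\nu \Bigr\rangle = \sum_\nu c_\nu \langle s_{\lambda/\mu}, h_\nu \rangle \\
 &= \sum_\nu c_\nu \langle s_\lambda, s_\mu h_\nu \rangle = \Bigl\langle s_\lambda, s_\mu \sum_\nu c_\nu h_\nu \Bigr\rangle = \langle s_\lambda, s_\mu f \rangle. \qquad \square
\end{align*}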


We can use Theorem 10.57 to expand skew Schur polynomials in terms of power-sum 
symmetric polynomials. 


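10.58. Theorem: Power-Sum Expansion of Skew Schur Polynomials. Suppose $\mu \subseteq
\lambda$ are partitions with $k = |\lambda| - |\mu|$. For all $N \geq |\lambda|$,

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \sum_{\nu \in \mathrm{Par}(k)} \frac{\chi^{\lambda/\mu}_\nu}{z_\nu}\,p_\nu(x_1,\ldots,x_N). \]

Proof. We imitate the proof of Theorem 10.53. Start with the expansion (see Remark 10.52)

\[ s_\mu p_\nu = \sum_\rho \chi^{\rho/\mu}_\nu\,s_\rho. \]

Now take the scalar product of both sides with $s_\lambda$ for a given partition $\lambda$:

\[ \langle s_\lambda, s_\mu p_\nu \rangle = \chi^{\lambda/\mu}_\nu. \]

We know the symmetric polynomial $s_{\lambda/\mu}$ has some expansion in the power-sum basis, say
$s_{\lambda/\mu} = \sum_\nu a_\nu p_\nu$ for some $a_\nu \in \mathbb{R}$. To find a particular $a_\nu$, take the scalar product with
$p_\nu/z_\nu$ and use Theorem 10.57 to get

\[ a_\nu = \langle s_{\lambda/\mu}, p_\nu/z_\nu \rangle = \langle s_\lambda, s_\mu p_\nu/z_\nu \rangle = \langle s_\lambda, s_\mu p_\nu \rangle / z_\nu = \chi^{\lambda/\mu}_\nu / z_\nu. \qquad \square \]

We also deduce the effect of the involution $\omega$ on skew Schur polynomials.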


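10.59. Theorem: Action of $\omega$ on Skew Schur Polynomials. For all partitions $\mu \subseteq \lambda$
and all $N \geq |\lambda|$,

\[ \omega(s_{\lambda/\mu}(x_1,\ldots,x_N)) = s_{\lambda'/\mu'}(x_1,\ldots,x_N). \]

Proof. We already know that the involution $\omega$ is a ring homomorphism and isometry sending
every $s_\alpha$ to $s_{\alpha'}$. For each partition $\nu$ of size $|\lambda| - |\mu|$, we can therefore write:

\begin{align*}
\langle \omega(s_{\lambda/\mu}), s_\nu \rangle
 &= \langle \omega(\omega(s_{\lambda/\mu})), \omega(s_\nu) \rangle = \langle s_{\lambda/\mu}, s_{\nu'} \rangle = \langle s_\lambda, s_\mu s_{\nu'} \rangle \\
 &= \langle \omega(s_\lambda), \omega(s_\mu s_{\nu'}) \rangle = \langle s_{\lambda'}, s_{\mu'} s_\nu \rangle = \langle s_{\lambda'/\mu'}, s_\nu \rangle.
\end{align*}

Thus $\omega(s_{\lambda/\mu})$ and $s_{\lambda'/\mu'}$ have the same expansion in the Schur basis and are therefore
equal. □
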


10.15 The Jacobi—Trudi Formulas 


Our next goal is to obtain formulas expressing skew Schur polynomials as determinants
involving the complete symmetric polynomials $h_k$ or the elementary symmetric polynomials
$e_k$. To derive these results, we need a new combinatorial construction relating tableaux to
collections of non-intersecting lattice paths.

We begin by interpreting $h_k(x_1,\ldots,x_N)$ in terms of lattice paths. Fix an integer $a$ and
consider the set $S$ of lattice paths from $(a,1)$ to $(a+k,N)$ that take unit steps up (u) and
east (e). We can encode a path $p$ in this set by listing the $y$-coordinates of the successive east
steps of $p$. For example, the path eeuueuee corresponds to the sequence 1, 1, 3, 4, 4. This gives
a bijection from $S$ to the set of weakly increasing sequences $1 \leq i_1 \leq i_2 \leq \cdots \leq i_k \leq N$.
Let the weight of the path corresponding to this sequence be $x_{i_1}x_{i_2}\cdots x_{i_k}$. Comparing to
the definition of $h_k$, we see that

\[ h_k(x_1,\ldots,x_N) = \sum_{p \in S} \mathrm{wt}(p). \]

This formula holds for all integers $k$ (possibly negative), if we use the convention $h_0 = 1$
and $h_k = 0$ for negative $k$.
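
The path encoding just described is easy to make concrete. The following sketch is illustrative, and its function names are hypothetical. A path is written as a word in the letters `e` and `u` starting at height $y = 1$, and $h_k$ is evaluated at numerical values by summing the weights of all paths.

```python
from itertools import combinations_with_replacement
from math import prod

def path_to_sequence(path):
    """Return the y-coordinates of the east steps of a path word starting at height 1."""
    y, seq = 1, []
    for step in path:
        if step == "u":
            y += 1
        elif step == "e":
            seq.append(y)
        else:
            raise ValueError("steps must be 'e' or 'u'")
    return seq

def sequence_to_path(seq, n):
    """Inverse map: a weakly increasing sequence with entries in 1..n gives a path ending at height n."""
    y, word = 1, []
    for target in seq:
        word.append("u" * (target - y))   # climb to the height of the next east step
        word.append("e")
        y = target
    word.append("u" * (n - y))            # finish by climbing to height n
    return "".join(word)

def h(k, xs):
    """Evaluate h_k(xs) as the sum of path weights; h_0 = 1 and h_k = 0 for k < 0."""
    if k < 0:
        return 0
    n = len(xs)
    return sum(prod(xs[i - 1] for i in seq)
               for seq in combinations_with_replacement(range(1, n + 1), k))
```

For example, `path_to_sequence("eeuueuee")` gives `[1, 1, 3, 4, 4]`, and `sequence_to_path([1, 1, 3, 4, 4], 4)` recovers the word `eeuueuee`.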

Now let $\lambda$ be a partition with $n \leq N$ parts, and let $\mu \subseteq \lambda$. Let $X$ be the set of fillings of
the skew shape $\lambda/\mu$ using letters in $\{1,2,\ldots,N\}$ such that each row weakly increases. Let
$Y$ be the set of sequences $P = (p_1,\ldots,p_n)$ where $p_i$ is a lattice path from $(n-i+\mu_i, 1)$ to
$(n-i+\lambda_i, N)$. Let $\mathrm{wt}(P) = \mathrm{wt}(p_1)\cdots\mathrm{wt}(p_n)$, so $\mathrm{wt}(P)$ keeps track of the $y$-coordinates
of all the east steps of the paths in $P$. As explained above, we can encode each row $i$ of
a filling $U \in X$ as a lattice path $p_i$ from $(a,1)$ to $(a+\lambda_i-\mu_i, N)$, where $a = n-i+\mu_i$.
The function sending $U$ to $(p_1,\ldots,p_n)$ is a weight-preserving bijection $f : X \to Y$. Some
examples are shown in Figure 10.4.

We say that two lattice paths intersect iff they share a common edge or vertex. Let $Y'$
be the set of $P \in Y$ such that no two paths in $P$ intersect. Inspection of Figure 10.4 suggests
that $f$ restricts to a weight-preserving bijection from $\mathrm{SSYT}_N(\lambda/\mu)$ to $Y'$. To see why this
holds, consider consecutive entries $U(i,j) = a$ and $U(i+1,j) = b$ in column $j$ of a filling
$U \in X$. In $f(U)$, path $p_i$ has an east step from $(n-i+\mu_i+(j-\mu_i)-1, a) = (n+j-i-1, a)$
to $(n+j-i, a)$, whereas $p_{i+1}$ has an east step from $(n+j-i-2, b)$ to $(n+j-i-1, b)$.
Suppose $a \geq b$. Since the beginning of $p_i$ goes from $(n-i+\mu_i, 1)$ to $(n+j-i-1, a)$, there
is no way for $p_{i+1}$ (which starts to the left of $p_i$) to reach the point $(n+j-i-1, b)$ without
intersecting $p_i$. Conversely, suppose two paths intersect. Then there must exist $i$ such that
$p_i$ and $p_{i+1}$ intersect. The earliest intersection of these paths must occur when $p_{i+1}$ reaches
$p_i$, by taking an east step ending at some point $(n+j-i-1, b)$. One may now check that
there must exist an east step in $p_i$ starting at $(n+j-i-1, a)$ for some $a \geq b$, which shows
that $U(i,j) \geq U(i+1,j)$ in the filling $U$.

[Figure 10.4: fillings of a skew shape with weakly increasing rows, each drawn beside its sequence of lattice paths.]

FIGURE 10.4
Encoding fillings of a skew shape by sequences of lattice paths.

Now we are ready to prove the Jacobi-Trudi Formulas. The idea is to introduce a large
collection of signed weighted sequences of paths that model the terms of a determinant.
Cancellations will remove all sequences of intersecting paths, leaving only the objects in $Y'$,
which correspond to semistandard skew tableaux.


10.60. The First Jacobi-Trudi Formula. Suppose $\lambda$ is a partition with $n \leq N$ parts,
and $\mu \subseteq \lambda$. Then

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \det[h_{\lambda_i-\mu_j+j-i}(x_1,\ldots,x_N)]_{1\leq i,j\leq n}. \]


Proof. By the definition of a determinant (see Definition 12.40), the right side of the formula
to be proved can be written

\[ \sum_{w \in S_n} \mathrm{sgn}(w) \prod_{i=1}^{n} h_{\lambda_i-\mu_{w(i)}+w(i)-i}(x_1,\ldots,x_N). \]

This is the generating function for the following signed weighted set. Let $Z$ be the set of
sequences $(w,p_1,\ldots,p_n)$ such that $w \in S_n$ and $p_i$ is a path from $(n-w(i)+\mu_{w(i)}, 1)$ to
$(\lambda_i+n-i, N)$. The weight of such a sequence is $\prod_{i=1}^{n} \mathrm{wt}(p_i)$, and the sign is $\mathrm{sgn}(w)$.

The following involution cancels all objects $(w,p_1,\ldots,p_n)$ in which two or more paths
intersect. Among all lattice points $(u,v)$ where two paths intersect, choose the one for which
$u$ is minimized; if there are ties, choose the point that minimizes $v$. Let $i < j$ be the two least
indices such that $p_i$ and $p_j$ pass through $(u,v)$. Write $p_i = qr$ where $q$ (respectively $r$) is
the part of $p_i$ before (respectively after) the point $(u,v)$. Similarly write $p_j = st$. Now, pair
the given object with the object $(w',p'_1,\ldots,p'_n)$ where $w' = w \circ (i,j)$, $p'_i = sr$, $p'_j = qt$, and
$p'_k = p_k$ for all $k \neq i,j$. (Thus we have switched the initial segments of the two intersecting
paths.) It can be checked that the new object lies in $Z$ and has the same weight and opposite
sign as the original object. Moreover, applying the map a second time restores the original
object, so we have an involution. Some examples are shown in Figure 10.5. (Note that path
$p_i$ goes from the $w(i)$th point from the right on the line $y = 1$ to the $i$th point from the
right on the line $y = N$.)

Let us consider an object $(w,p_1,\ldots,p_n)$ in $Z$ that is not canceled by the involution. No
two paths in this object can intersect. We claim that this forces $w = \mathrm{id}$. For otherwise, there
would exist $i < j$ with $w(i) > w(j)$. But then $p_i$ would start to the left of $p_j$ on the line
$y = 1$ and end to the right of $p_j$ on the line $y = N$, which would force $p_i$ and $p_j$ to intersect.
So $w = \mathrm{id}$. Erasing $w$ maps the fixed points in $Z$ bijectively to the set $Y'$, which in turn
maps bijectively to $\mathrm{SSYT}_N(\lambda/\mu)$, as shown in the discussion preceding the theorem. □


10.61. The Second Jacobi-Trudi Formula. Suppose $\lambda$ is a partition with $\lambda_1 = n \leq N$,
and $\mu \subseteq \lambda$. Then

\[ s_{\lambda/\mu}(x_1,\ldots,x_N) = \det[e_{\lambda'_i-\mu'_j+j-i}(x_1,\ldots,x_N)]_{1\leq i,j\leq n}. \]

[Figure 10.5: four panels of intersecting lattice paths, labeled $w = 1243$, $w = 1234$, $w = 3124$, and $w = 4123$.]

FIGURE 10.5
Cancellation mechanism for intersecting paths.

Proof. For all $f_{ij} \in \Lambda_N$, we have $\omega(\det[f_{ij}]) = \det[\omega(f_{ij})]$. This follows from the defining
formula for determinants and the fact that $\omega$ is a ring homomorphism. So, we obtain the
second Jacobi-Trudi Formula by applying $\omega$ to both sides of the first Jacobi-Trudi Formula

\[ s_{\lambda'/\mu'} = \det[h_{\lambda'_i-\mu'_j+j-i}]. \qquad \square \]


10.62. Example. According to the first Jacobi-Trudi Formula,

\[ s_{(3,3,1)} = \det\begin{bmatrix} h_3 & h_4 & h_5 \\ h_2 & h_3 & h_4 \\ 0 & 1 & h_1 \end{bmatrix}
   = h_{(3,3,1)} + h_{(5,2)} - h_{(4,3)} - h_{(4,2,1)}. \]

Note that the main diagonal entries in the formula for $s_\lambda$ are $h_{\lambda_1}, h_{\lambda_2}, \ldots, h_{\lambda_n}$, and the
subscripts increase by 1 (respectively decrease by 1) as we read to the right (respectively
left) along each row. Similarly,

\[ s_{(3,3,1)} = \det\begin{bmatrix} e_3 & e_4 & e_5 \\ e_1 & e_2 & e_3 \\ 1 & e_1 & e_2 \end{bmatrix}
   = e_{(3,2,2)} + e_{(4,3)} + e_{(5,1,1)} - e_{(5,2)} - e_{(3,3,1)} - e_{(4,2,1)}. \]

Here is a typical expansion of a skew Schur polynomial:

\[ s_{(5,5,3)/(3,2,0)} = \det\begin{bmatrix} h_2 & h_4 & h_7 \\ h_1 & h_3 & h_6 \\ 0 & 1 & h_3 \end{bmatrix}
   = h_{(3,3,2)} + h_{(7,1)} - h_{(4,3,1)} - h_{(6,2)}. \]
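
The determinants in this example can be checked numerically. The sketch below is only an illustration with hypothetical function names: it evaluates the Jacobi-Trudi determinant at integer points and compares it with a brute-force sum over semistandard skew tableaux.

```python
from itertools import combinations_with_replacement, permutations, product
from math import prod

def h(k, xs):
    """Complete homogeneous polynomial h_k at xs, with h_0 = 1 and h_k = 0 for k < 0."""
    if k < 0:
        return 0
    return sum(prod(c) for c in combinations_with_replacement(xs, k))

def det(matrix):
    """Determinant via the sum over permutations (small matrices only)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        total += (-1) ** inversions * prod(matrix[r][perm[r]] for r in range(n))
    return total

def jacobi_trudi(lam, mu, xs):
    """Evaluate det[h_{lam_i - mu_j + j - i}(xs)]; j - i is the same for 0- or 1-based indices."""
    n = len(lam)
    mu = list(mu) + [0] * (n - len(mu))
    return det([[h(lam[i] - mu[j] + j - i, xs) for j in range(n)] for i in range(n)])

def skew_schur(lam, mu, xs):
    """Evaluate s_{lam/mu}(xs) by enumerating semistandard fillings of the skew shape."""
    mu = list(mu) + [0] * (len(lam) - len(mu))
    cells = [(r, c) for r in range(len(lam)) for c in range(mu[r], lam[r])]
    total = 0
    for values in product(range(len(xs)), repeat=len(cells)):
        f = dict(zip(cells, values))
        if all(((r, c - 1) not in f or f[(r, c - 1)] <= v) and
               ((r - 1, c) not in f or f[(r - 1, c)] < v)
               for (r, c), v in f.items()):
            total += prod(xs[v] for v in values)
    return total
```

With `xs = (1, 2, 3)`, the calls `jacobi_trudi((3, 3, 1), (), xs)` and `skew_schur((3, 3, 1), (), xs)` should agree, and likewise for the skew shape $(5,5,3)/(3,2,0)$. Note that $N = 3$ satisfies the hypothesis $n \leq N$ here.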


10.16 The Inverse Kostka Matrix 


In Chapter 9, the Kostka matrix played a prominent role in relating the Schur basis of $\Lambda_N$
to several other bases. More specifically, we proved the formulas

\[ s_\lambda = \sum_\mu K_{\lambda,\mu}\,m_\mu, \qquad h_\mu = \sum_\lambda K_{\lambda,\mu}\,s_\lambda, \qquad e_\mu = \sum_\lambda K_{\lambda,\mu}\,s_{\lambda'}, \]

where all symmetric polynomials have $N$ variables and all summations extend over $\mathrm{Par}_N$.
Letting $K = K_N$ be the matrix of Kostka numbers with rows and columns indexed by
elements of $\mathrm{Par}_N$, and letting $s$, $m$, $h$, and $e$ be column vectors with entries $s_\mu$, $m_\mu$, $h_\mu$, and $e_\mu$,
these relations can also be written

\[ s = Km, \qquad h = K^{\mathrm{tr}}s, \qquad e = K^{\mathrm{tr}}\omega(s). \]

We know that the Kostka matrix is invertible (being unitriangular). Let $K'_{\lambda,\mu}$ be the
entry in row $\lambda$ and column $\mu$ of the inverse of the Kostka matrix. Inverting the relations
above, we see that

\[ m_\lambda = \sum_\mu K'_{\lambda,\mu}\,s_\mu, \qquad s_\mu = \sum_\lambda K'_{\lambda,\mu}\,h_\lambda, \qquad s_{\mu'} = \sum_\lambda K'_{\lambda,\mu}\,e_\lambda. \]

Observe that the determinant formulas in the previous section, which express Schur polynomials
in terms of complete homogeneous symmetric polynomials, give algebraic interpretations
for the coefficients $K'_{\lambda,\mu}$. In this section, we derive combinatorial interpretations for
these coefficients. To do this, we need the concept of a special rim-hook tableau.

10.63. Definition: Special Rim-Hook Tableaux. For $\lambda, \mu \in \mathrm{Par}_N$, a special rim-hook
tableau of shape $\mu$ and type $\lambda$ is a rim-hook tableau $S$ of shape $\mu$ and content $\alpha$ such that
$\mathrm{sort}(\alpha) = \lambda$ and every nonzero rim-hook in $S$ contains a cell in the leftmost column of the
diagram of $\mu$. The sign of such a tableau is defined as in Definition 10.48. Let $\mathrm{SRHT}(\mu,\lambda)$
be the set of special rim-hook tableaux of shape $\mu$ and type $\lambda$.


10.64. Theorem: Combinatorial Interpretation of the Inverse Kostka Matrix.
For all $\lambda, \mu \in \mathrm{Par}_N$,

\[ K'_{\lambda,\mu} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S). \]

Proof. We give a combinatorial proof of the identity

\[ a_{\delta(N)}(x_1,\ldots,x_N)\,m_\lambda(x_1,\ldots,x_N)
   = \sum_{\mu \in \mathrm{Par}_N} \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S)\,a_{\mu+\delta(N)}(x_1,\ldots,x_N). \]

Once this is done, the theorem follows by dividing both sides by $a_{\delta(N)}$ and comparing the
resulting identity to the known expansion $m_\lambda = \sum_{\mu \in \mathrm{Par}_N} K'_{\lambda,\mu}\,s_\mu$, which is the unique way of
writing $m_\lambda$ as a linear combination of Schur symmetric polynomials.

To prove the identity, we study a combinatorial interpretation of the product $a_{\delta(N)}m_\lambda$
involving abaci. Each term in the polynomial $a_{\delta(N)}$ is modeled by a justified abacus containing
$N$ beads labeled $w(N),\ldots,w(1)$ in positions $0,\ldots,N-1$ (respectively). Given such
an abacus, we can view $m_\lambda(x_1,\ldots,x_N)$ as the sum of all distinct monomials $\prod_{i=1}^{N} x_{w(i)}^{e(i)}$
such that the exponent sequence $(e(1),\ldots,e(N))$ is a rearrangement of $(\lambda_1,\ldots,\lambda_N)$. Here
and below, we view elements of $\mathrm{Par}_N$ as partitions with exactly $N$ parts, some of which
may be zero. The multiplication of $a_{\delta(N)}$ by one of these monomials can be implemented
on the abacus as follows. Imagine moving the $N$ justified beads from their current runner
to a new, initially empty runner, by moving each bead $w(i)$ from position $N-i$ on the
old runner to position $N-i+e(i)$ on the new runner. Call such a transformation of the
justified abacus a $\lambda$-move. A given $\lambda$-move either causes a bead collision on the new runner,
or else produces a new abacus, which is enumerated by a monomial in $a_{\mu+\delta(N)}(x_1,\ldots,x_N)$
for some $\mu \in \mathrm{Par}_N$.

Consider the situation where a bead collision occurs. Choose $i$ minimal such that bead
$w(i)$ collides with some other bead on the new runner, and then choose $j$ minimal such that
bead $w(i)$ collides with $w(j)$. Create a new object counted by $a_{\delta(N)}m_\lambda$ by switching beads
$w(i)$ and $w(j)$ on the old abacus, and switching $e(i)$ and $e(j)$ in the exponent vector. This
defines a sign-reversing, weight-preserving involution that cancels all objects in which bead
collisions occur.

To complete the proof, we must find a sign-preserving, weight-preserving bijection $\phi$
from the set $X$ of uncanceled objects counted by $a_{\delta(N)}m_\lambda$ to the signed weighted set

\[ \bigcup_{\mu \in \mathrm{Par}_N} \mathrm{SRHT}(\mu,\lambda) \times \mathrm{LAbc}(\mu+\delta(N)). \]

For this purpose, we fix $\mu \in \mathrm{Par}_N$ and consider the ways in which a justified abacus with $N$
beads can be transformed into an abacus in $\mathrm{LAbc}(\mu+\delta(N))$ by means of a $\lambda$-move. Let us
temporarily ignore bead labels and signs, concentrating at first only on the positions of the
$N$ beads. The positions of the $N$ beads on the old runner are the entries in the sequence
$\delta(N) = (N-1, N-2, \ldots, 2, 1, 0)$. A $\lambda$-move adds some rearrangement of the sequence
$\lambda = (\lambda_1,\ldots,\lambda_N)$ to the sequence $\delta(N)$. We obtain an abacus in $\mathrm{LAbc}(\mu+\delta(N))$ iff the sum
of these sequences is some rearrangement of the sequence

\[ \mu+\delta(N) = (\mu_1+N-1,\ \mu_2+N-2,\ \ldots,\ \mu_N+N-N). \]

FIGURE 10.6
A special rim-hook tableau.


We now show that the rearrangements of $\lambda$ that produce abaci in $\mathrm{LAbc}(\mu + \delta(N))$ can be
encoded by special rim-hook tableaux of shape $\mu$ and type $\lambda$. The proof uses induction on
$N$. Let us first illustrate the idea of the proof by considering an example. Take $N = 5$,
$\mu = (7,5,4,4,2)$, and $\lambda = (8,7,6,1,0)$. We seek rearrangements of the vector $(8,7,6,1,0)$ which,
when added to the vector $(4,3,2,1,0)$, produce a rearrangement of $\mu + \delta(N) = (11,8,6,5,2)$.
In this example, the only solution turns out to be $(1,8,0,7,6) + (4,3,2,1,0) = (5,11,2,8,6)$.
We can visualize this solution using the special rim-hook tableau in Figure 10.6, in which
the rim-hooks (from top to bottom) have lengths $(1,8,0,7,6)$. If we start with a labeled
justified abacus $54321000\cdots$ and perform a $\lambda$-move using the rearrangement $(1,8,0,7,6)$,
we obtain the abacus $003001504002000\cdots \in \mathrm{LAbc}(11,8,6,5,2)$. The sign of this abacus,
namely $\mathrm{sgn}(24513) = -1$, differs from the sign of the original abacus, namely $\mathrm{sgn}(12345) = +1$,
by a factor of $(-1)^5 = \mathrm{sgn}(S)$. A similar remark holds if the original abacus had involved
some other permutation of the five labels.
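
The search in this example is small enough to check by brute force. The following Python sketch is only an illustration (the function names `lambda_moves` and `sign_of_label_order` are made up here and are not notation from the text): it tries every rearrangement of $\lambda$, adds it entrywise to $\delta(N)$, and keeps those whose sums form a rearrangement of $\mu + \delta(N)$. It also computes the sign of a bead-label sequence by counting inversions.

```python
from itertools import permutations


def lambda_moves(lam, mu):
    """Return all rearrangements r of lam such that r + delta(N) is a
    rearrangement of mu + delta(N).  Assumes len(lam) == len(mu) == N."""
    n = len(mu)
    delta = [n - 1 - i for i in range(n)]          # (N-1, ..., 1, 0)
    target = sorted(m + d for m, d in zip(mu, delta))
    found = set()
    for perm in permutations(lam):
        if sorted(p + d for p, d in zip(perm, delta)) == target:
            found.add(perm)
    return sorted(found)


def sign_of_label_order(labels):
    """Sign of a sequence of distinct labels: (-1)^(number of inversions)."""
    inversions = sum(
        1
        for a in range(len(labels))
        for b in range(a + 1, len(labels))
        if labels[a] > labels[b]
    )
    return -1 if inversions % 2 else 1


# The example from the text: N = 5, mu = (7,5,4,4,2), lambda = (8,7,6,1,0).
moves = lambda_moves((8, 7, 6, 1, 0), (7, 5, 4, 4, 2))
print(moves)
print(sign_of_label_order([2, 4, 5, 1, 3]))
```

The brute-force loop visits all $N!$ rearrangements, so it is only meant for small examples like this one.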

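With this example in mind, we return to the general proof. We are seeking permutations
$j_1 \cdots j_N$ and $k_1 \cdots k_N$ satisfying the system of equations

$$\begin{aligned}
0 + \lambda_{j_N} &= \mu_{k_N} + N - k_N \\
1 + \lambda_{j_{N-1}} &= \mu_{k_{N-1}} + N - k_{N-1} \\
&\ \ \vdots \\
N - 1 + \lambda_{j_1} &= \mu_{k_1} + N - k_1.
\end{aligned} \tag{10.5}$$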


In particular, to satisfy the first equation, we need an index $j = j_N$ and an index $k = k_N$
such that $\lambda_j = \mu_k + N - k$. If such an index exists, we encode it by drawing the unique
border ribbon of length $\lambda_j$ starting in the leftmost cell of row $N$ of $\mu$. By choice of $j$ and $k$,
this border ribbon must end in the rightmost cell of row $k$ of $\mu$. In terms of the abaci, the
$\lambda$-move encoded by $j_1 \cdots j_N$ moves the bead in position 0 on the old runner (the $N$th bead
from the right) to position $\mu_k + N - k$ on the new runner (which becomes the $k$th bead
from the right). Thus this bead moves past $N - k$ other beads during the $\lambda$-move, which
causes a sign change of $(-1)^{N-k}$ for any choice of labels. But $N - k$ is precisely the spin of
the border ribbon we just drew.

To finish solving system (10.5), let $\lambda^*$ be the partition obtained by dropping one part $\lambda_j$
from $\lambda$, and let $\mu^*$ be the partition in $\mathrm{Par}_{N-1}$ obtained by erasing the cells of $\mu$ occupied by
the ribbon that starts in row $N$. Suppose we ignore the first equation in the system (10.5)
and subtract 1 from both sides of the remaining $N - 1$ equations. One may check that
the resulting system of $N - 1$ equations is precisely the system we must solve to change a
justified abacus to an abacus in $\mathrm{LAbc}(\mu^* + \delta(N-1))$ by means of a $\lambda^*$-move.

For instance, in the example considered earlier, after we move a bead from position
0 to position 6 (accounting for the lowest rim-hook in the displayed tableau), we have
$\lambda^* = (8,7,1,0)$ and $\mu^* = (7,5,3,1)$. Having moved one bead, we are left with the task
of moving beads from positions $(4,3,2,1) = (1,1,1,1) + \delta(4)$ to positions $(11,8,5,2) =
(1,1,1,1) + \mu^* + \delta(4)$ using the moves in $\lambda^* = (8,7,1,0)$.

By induction on $N$, the solutions of the reduced system are encoded by special rim-hook
tableaux $S^*$ of shape $\mu^*$ and type $\lambda^*$; and furthermore, the net sign change going from the
old abacus to the new abacus (disregarding the bead originally in position 0) is $\mathrm{sgn}(S^*)$. It
follows that all solutions of the original system are encoded by special rim-hook tableaux $S$
of shape $\mu$ and type $\lambda$; and furthermore, the net sign change going from the old abacus to
the new abacus (taking all beads into account) is $\mathrm{sgn}(S)$.

The preceding discussion contains an implicit recursive definition of the required bijection
$\phi$. More explicitly, suppose $z = (w(N) \cdots w(1)000\cdots,\ e(N) \cdots e(1)) \in X$ is an
uncanceled object counted by $a_{\delta(N)} m_\lambda$. Then $\phi(z) = (S, v)$ where $v \in \mathrm{LAbc}(\mu + \delta(N))$
is obtained from the first component of $z$ by moving bead $w(i)$ right $e(i)$ positions for all
$i$, and $S$ is the unique special rim-hook tableau (of shape $\mu$ determined by $v$) that has a
rim-hook of length $e(i)$ starting in the leftmost cell of row $i$ of the diagram. The preceding
arguments show that $\phi$ preserves signs and weights. To compute $\phi^{-1}(S, v)$, it suffices to
note that the sequence $(e(1), \ldots, e(N))$ is the content of the rim-hook tableau $S$. Knowledge
of this sequence allows us to reverse the $\lambda$-move and recover $w(N) \cdots w(1)$. Thus, $\phi$ is
a bijection. □
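
The recursion used in this proof can be turned into a short computation of the signed count $\sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S)$. The sketch below is an illustration under stated assumptions, not the book's algorithm verbatim: the function name `inverse_kostka` is hypothetical, partitions are padded with zeros to a common length $N$, and the "targets" are the entries of $\mu + \delta(N)$. At each step the bead in position 0 is sent to a target equal to one of the remaining parts of $\lambda$, the spin $N - k$ is recorded, and the remaining targets are decreased by 1, exactly as in the passage from system (10.5) to the reduced system.

```python
def inverse_kostka(lam, mu):
    """Signed count of special rim-hook tableaux of shape mu and type lam,
    computed by peeling off the ribbon that starts in the last row."""
    if sum(lam) != sum(mu):
        return 0
    n = max(len(lam), len(mu))
    lam = tuple(lam) + (0,) * (n - len(lam))
    mu = tuple(mu) + (0,) * (n - len(mu))
    # Entries of mu + delta(N); this sequence is strictly decreasing.
    targets = tuple(mu[k] + n - 1 - k for k in range(n))

    def solve(targets, parts):
        if not targets:
            return 1
        total = 0
        size = len(targets)
        for part in set(parts):                 # equal parts give the same ribbon
            if part not in targets:
                continue
            k = targets.index(part)             # 0-based row where the ribbon ends
            rest = tuple(t - 1 for t in targets if t != part)
            if rest and min(rest) < 0:
                continue                        # target 0 can only be hit from position 0
            spin = size - 1 - k                 # N - k with k counted from 1
            remaining = list(parts)
            remaining.remove(part)
            total += (-1) ** spin * solve(rest, tuple(remaining))
        return total

    return solve(targets, lam)


print(inverse_kostka((8, 7, 6, 1, 0), (7, 5, 4, 4, 2)))
print(inverse_kostka((2, 1), (1, 1, 1)))
```

For the running example the only solution has total spin 5, so the first call returns the sign $(-1)^5$.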


10.65. Remark. An alternate approach to the theorem is to define

$$K'_{\lambda,\mu} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S)$$

and then give a combinatorial proof of the matrix identity $KK' = I$ (see Exercise 10-55).
Since $K$ is known to be invertible, it follows that $K'$ must be the (two-sided) matrix inverse
of $K$.


10.17 Schur Expansion of Skew Schur Polynomials 


We now consider the expansion of skew Schur polynomials as linear combinations of ordinary
Schur polynomials. Since the ordinary Schur polynomials are a basis of $\Lambda_N$ and the skew
Schur polynomials are in this vector space, we know there exist unique scalars $c^{\lambda}_{\nu,\mu} \in \mathbb{R}$ such
that

$$s_{\lambda/\nu}(x_1, \ldots, x_N) = \sum_{\mu} c^{\lambda}_{\nu,\mu}\, s_\mu(x_1, \ldots, x_N), \tag{10.6}$$

where it suffices to sum over partitions $\mu$ of size $|\lambda/\nu|$. The scalars $c^{\lambda}_{\nu,\mu}$ are called
Littlewood-Richardson coefficients. The following result shows that these coefficients are all nonnegative
integers. Recall that, for a semistandard tableau $T$ of any shape, the word of $T$ is obtained
by concatenating the rows of $T$ from bottom to top.


10.66. The Littlewood-Richardson Rule for Skew Schur Polynomials. For all
partitions $\lambda, \mu, \nu$, $c^{\lambda}_{\nu,\mu}$ is the number of semistandard tableaux $T$ of shape $\lambda/\nu$ and content
$\mu$ such that every suffix of the word of $T$ has partition content. In other words, writing
$w(T) = w_1 w_2 \cdots w_n$, we require that for all $k$ between 1 and $n$ and all $i \ge 1$, the number of
$i$'s in the suffix $w_k w_{k+1} \cdots w_n$ equals or exceeds the number of $(i+1)$'s in this suffix.


Proof. Multiplying both sides of (10.6) by $a_{\delta(N)}$, it suffices to prove the identity

$$a_{\delta(N)}(x_1, \ldots, x_N)\, s_{\lambda/\nu}(x_1, \ldots, x_N) = \sum_{\mu} c^{\lambda}_{\nu,\mu}\, a_{\mu+\delta(N)}(x_1, \ldots, x_N).$$

The idea is to generalize the proof of the special case $\nu = (0)$ given in §10.13. We model the
left side of the identity by the set $X$ of pairs $(v, T)$, where $v$ is a justified $N$-bead labeled
abacus and $T$ is a semistandard tableau of shape $\lambda/\nu$ using the alphabet $\{1, 2, \ldots, N\}$
ordered by $<_v$. Since skew Schur polynomials are symmetric, the generating function for
the signed weighted set $X$ is $a_{\delta(N)} s_{\lambda/\nu}$.

We now define an involution $I : X \to X$. Given $(v, T) \in X$, $T$ determines a sequence
of bead motions on $v$ by reading $w(T)$ from right to left and moving bead $k$ one step to
the right each time the symbol $k$ is seen. If these bead motions cause a collision, define
$I(v, T) = (v', T')$ by the following rules. Suppose the first collision occurs when bead $i$
bumps into bead $j$, where $i >_v j$ are adjacent beads in $v$. Let $v'$ be $v$ with beads $i$ and $j$
switched, so $\mathrm{sgn}(v') = -\mathrm{sgn}(v)$ and $\mathrm{wt}(v') x_j = \mathrm{wt}(v) x_i$.

Next, we calculate $T'$ from $T$ as follows. Starting with the word of $T$, replace each $i$ by
a left parenthesis, each $j$ by a right parenthesis, and ignore all other symbols. Match left
and right parentheses in the resulting string of parentheses, and ignore these matched pairs
of parentheses hereafter. The remaining unmatched parentheses must consist of a string
of $a \ge 0$ right parentheses followed by a string of $b \ge 0$ left parentheses, since if a left
parenthesis appeared somewhere to the left of a right parenthesis we could find another
matched pair of parentheses.

Note that $b > 0$, since otherwise bead $i$ would never bump into bead $j$. Indeed, the first
bead collision occurs when we reach the rightmost unmatched left parenthesis (occurrence
of $i$) in the word of $T$. Now, change the subword of unmatched parentheses from ")^a (^b"
(that is, $a$ right parentheses followed by $b$ left parentheses) to ")^{b-1} (^{a+1}", and then
convert all left parentheses to $j$'s and all right parentheses to $i$'s. One
may verify that the new word is the word of a tableau $T' \in \mathrm{SSYT}_N(\lambda/\nu)$, relative to the
ordering $<_{v'}$, because $i$ and $j$ are adjacent relative to the orderings $<_v$ and $<_{v'}$, and the
status of a given parenthesis symbol in $T'$ (matched or unmatched) is the same as its status
in $T$. See the example following the proof for more discussion of this point.

Because $T'$ has one less $i$ than $T$ and $T'$ has one more $j$ than $T$, we have $\mathrm{wt}(T') x_i =
\mathrm{wt}(T) x_j$. Since we also had $\mathrm{wt}(v') x_j = \mathrm{wt}(v) x_i$, we see that $\mathrm{wt}(v', T') = \mathrm{wt}(v, T)$. Thus $I$
is sign-reversing and weight-preserving. Finally, to check that $I$ is an involution, consider
what happens when we use $T'$ to move the beads on $v'$. Bead $j$ on $v'$ moves the same way
as bead $i$ did on $v$ (and vice versa) until we reach the rightmost unmatched parenthesis
(relative to $i$ and $j$) in $w(T')$. When this symbol is reached, bead $j$ bumps into bead $i$ on
$v'$, just as bead $i$ bumped into bead $j$ on $v$. To compute $I(v', T')$, we therefore apply the
parenthesis modification rule to the $i$'s and $j$'s appearing in $w(T')$. This rule changes the
unmatched parentheses from ")^{b-1} (^{a+1}" back to ")^a (^b", which shows that $I(v', T') = (v, T)$.
So $I$ is an involution.
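
The parenthesis rule acts on words alone, so it is easy to experiment with. Here is a minimal Python sketch, assuming that bead labels are single characters and that the word is given as a string; the name `modify_word` is hypothetical. The argument `i` is the label of the moving bead (left parentheses) and `j` is the label of the bead it hits (right parentheses).

```python
def modify_word(word, i, j):
    """Apply the parenthesis modification rule to `word`.

    Each i is read as "(" and each j as ")".  Matched symbols are swapped
    (a matched "(" becomes j, a matched ")" becomes i); the unmatched string
    ")^a (^b" is replaced by ")^(b-1) (^(a+1)" before converting back.
    """
    unmatched_left = []    # stack of positions of i's not yet matched
    unmatched_right = []   # positions of j's with no earlier partner
    for pos, symbol in enumerate(word):
        if symbol == i:
            unmatched_left.append(pos)
        elif symbol == j:
            if unmatched_left:
                unmatched_left.pop()
            else:
                unmatched_right.append(pos)
    b = len(unmatched_left)
    if b == 0:
        raise ValueError("bead i never bumps into bead j for this word")
    # All unmatched ")" lie to the left of all unmatched "(".
    unmatched = unmatched_right + unmatched_left
    new_word = [j if s == i else i if s == j else s for s in word]
    for rank, pos in enumerate(unmatched):
        # first b-1 unmatched symbols become ")" -> i, the rest "(" -> j
        new_word[pos] = i if rank < b - 1 else j
    return "".join(new_word)


w = "12211122221111212222111"
w_new = modify_word(w, i="2", j="1")
print(w_new)
print(modify_word(w_new, i="1", j="2") == w)
```

When the rule is applied a second time, the roles of `i` and `j` are exchanged, mirroring the fact that on $v'$ it is bead $j$ that bumps into bead $i$.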

All that remains is to analyze the fixed points of $I$, which are (by definition) the pairs
$(v, T)$ for which no bead collision occurs. Recall that we are starting with a justified abacus $v$,
scanning the symbols in $w(T) = w_1 \cdots w_n$ from right to left, and moving the corresponding
beads on $v$. Suppose all suffixes of $w(T)$ have partition content relative to the ordering $<_v$
(which means the rightmost bead label occurs at least as often in each suffix as the next
bead label, and so on). We see from the description of the bead motion that no collision
occurs. Conversely, if the condition is first violated by some suffix $w_k w_{k+1} \cdots w_n$, then a
collision occurs at this point in the scan. Thus the fixed points of $I$ are the pairs $(v, T)$ such
that each suffix of $w(T)$ has partition content relative to $<_v$. We map each such fixed point to
the abacus $v^*$ obtained from $v$ by performing the bead motions specified by $T$. The abacus
$v^*$ lies in the set $\mathrm{LAbc}(\mu + \delta(N))$, where $\mu$ is the content of $T$ (calculating content relative
to $<_v$, so $\mu_1$ is the number of times the rightmost bead moves, etc.).
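
The scan described here can be simulated directly. The following Python sketch is illustrative only; the function name `scan_word` and the string encoding of abaci are assumptions made for this example. A justified abacus is given as a string of single-character bead labels occupying positions 0, 1, 2, ..., and the word is scanned from right to left. The third call uses a word chosen for this illustration (it is not claimed to be the word of any tableau displayed in the text).

```python
def scan_word(abacus, word):
    """Move beads on a justified labeled abacus according to `word`.

    Returns ("collision", moving_bead, hit_bead, suffix) for the first
    collision, or ("fixed", positions) if no collision occurs, where
    `positions` maps each bead label to its final position.
    """
    position = {label: p for p, label in enumerate(abacus)}
    occupied = {p: label for label, p in position.items()}
    for start in range(len(word) - 1, -1, -1):
        bead = word[start]
        new_pos = position[bead] + 1
        if new_pos in occupied:
            return ("collision", bead, occupied[new_pos], word[start:])
        del occupied[position[bead]]
        occupied[new_pos] = bead
        position[bead] = new_pos
    return ("fixed", position)


print(scan_word("54321", "12211122221111212222111"))
print(scan_word("54321", "352235112445233111"))
print(scan_word("54321", "3322211211"))
```

In the collision case, the returned suffix is the shortest suffix of the word that fails to have partition content relative to the ordering determined by the abacus.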

We can obtain all the fixed points of $I$ from fixed points of the form $(v^0, T)$, where
$v^0 = (N, N-1, \ldots, 1, 0, 0, \ldots)$, $<_{v^0}$ is the usual ordering on integers, and $T$ is a semistandard
tableau satisfying the conditions in the theorem statement. We need only permute the
bead labels in $v^0$ by any $w \in S_N$, and permute the entries of $T$ in the same way. The
object $(v^0, T)$ thereby generates $N!$ fixed points, which together contribute one copy of
$a_{\mu+\delta(N)}(x_1, \ldots, x_N)$ to the generating function for the fixed points of $I$. The total number
of times this term appears in the generating function is the total number of semistandard
tableaux $T$ of content $\mu$ satisfying the conditions in the theorem. Since the generating
function for $X$ must equal the generating function for the fixed point set of $I$, the proof is
complete. □

10.67. Example. To illustrate the parenthesis construction, we compute $I(5432100\cdots, T)$,
where 


The word of $T$ is 12211122221111212222111. The suffix 2222111 of $w(T)$ does not have
partition content, so this object cancels with some object $(5431200\cdots, T')$. To find $T'$, first
convert 1's to right parentheses and 2's to left parentheses in $w(T)$:


    12211122221111212222111
    )(()))(((())))()(((()))


Now we balance parentheses and mark the remaining unmatched symbols:


    )(()))(((())))()(((()))
    *    *          *


The substring of unmatched parentheses is ))(. Observe that the rightmost symbol in this
substring is a left parenthesis corresponding to the first 2 in the offending suffix 2222111,
and this 2 is the symbol in $w(T)$ causing the first bead collision. As directed by the proof,
we convert the unmatched parenthesis string to ((( and then replace left parentheses by 1's
and right parentheses by 2's:


    *    *          *
    ((())((((())))()(((()))
    11122111112222121111222


This new word is $w(T')$, so finally


Observe that $T'$ is a semistandard tableau relative to the ordering $5 > 4 > 3 > 1 > 2$.
In particular, columns of $T'$ strictly increase because whenever 1 appears above 2 in $T$,
these occurrences of 1 and 2 become matched parentheses. Rearranging the unmatched
parentheses does not affect these symbols, so in the end we get a 2 above a 1 in $T'$. Also,
rows of $T'$ weakly increase since a strict decrease in some row would be encoded as a matched
parenthesis pair in $w(T')$, which would have also been matched in $w(T)$, implying that $T$
had a strict decrease in some row. But $T$ is a semistandard tableau so this cannot happen.
Finally, note that the shortest suffix of $w(T')$ that does not have partition content (relative to
the new ordering) is 1111222, where the leftmost 1 corresponds to the rightmost unbalanced
parenthesis in $w(T')$. Consequently, $I(5431200\cdots, T') = (5432100\cdots, T)$. Observe that
these two objects have opposite sign, but both have weight $x_1^{16} x_2^{14} x_3^{2} x_4^{1} x_5^{0}$.
10.68. Example. Let us compute $I(v, T)$, where


$v = 5432100\cdots$ ,


Moving beads on $v$ according to the word $w(T) = 352235112445233111$, bead 3 bumps into
bead 2 when we have scanned the suffix 3111 (which is the shortest suffix without partition
content). We therefore modify the 2's and 3's in the word as follows:


    352235112445233111
    3 223   2   233
    ( ))(   )   )((
       *        ***
    ( ))(   )   (((
    2 332   3   222
    253325113445222111


Therefore $I(v, T) = (v', T')$, where


$v' = 5423100\cdots$, $T' =$


Observe that $\mathrm{wt}(v, T) = \mathrm{wt}(v', T') = x_1^{9} x_2^{7} x_3^{6} x_4^{3} x_5^{3}$, $\mathrm{sgn}(v', T') = -\mathrm{sgn}(v, T)$, and
$I(v', T') = (v, T)$.


10.69. Example. Let us compute $c^{\lambda}_{\nu,\mu}$ when $\lambda = (5,4,4,1)$, $\nu = (3,1)$, and $\mu = (4,4,2)$.
We draw the semistandard tableaux of shape $\lambda/\nu$ whose words have the required suffix
property. The following two tableaux are the only ones, so $c^{\lambda}_{\nu,\mu} = 2$:


Let us see how these tableaux correspond to fixed points of $I$ when $N = 5$. The first tableau
changes the standard abacus $v^0 = (5432100\cdots)$ to the abacus $(54003002100\cdots)$ counted
by $\mathrm{LAbc}(\mu + \delta(5))$ by moving bead 1 twice, then bead 2 once, then bead 1 twice, and
so on. Permuting the labels gives the other 119 signed objects that make up one copy of
$a_{\mu+\delta(5)}(x_1, \ldots, x_5)$; for instance,


$3425100\cdots$ , maps to $(34002005100\cdots)$.


On the other hand, the second tableau changes the standard abacus $(5432100\cdots)$ to the
abacus $(54003002100\cdots)$ via a different sequence of collision-free bead moves: move bead
1 twice, then bead 2 twice, then bead 1 once, and so on. This pair and its permutations
produce another copy of the generating function $a_{\mu+\delta(5)}(x_1, \ldots, x_5)$. Dividing by $a_{\delta(5)}$, we
conclude that

$$s_{\lambda/\nu} = 2 s_\mu + \cdots.$$
Now let us compute $c^{\lambda}_{\mu,\nu}$. The required skew tableaux, which have shape $(5,4,4,1)/(4,4,2)$
and content $(3,1)$, are:


So $c^{\lambda}_{\mu,\nu} = 2$. This illustrates the general symmetry property $c^{\lambda}_{\nu,\mu} = c^{\lambda}_{\mu,\nu}$, which is true but
not immediately evident from our combinatorial description of these coefficients. We prove
this property in §10.18 when we discuss products of Schur polynomials.


10.70. Example. For $N \ge 7$, let us find the Schur expansion of the skew Schur polynomial
$s_{(3,3,2,2)/(2,1)}$ in $N$ variables. This expansion is found by enumerating all semistandard skew
tableaux of shape $(3,3,2,2)/(2,1)$ satisfying the required suffix property. Each such tableau
of content $\mu$ contributes one term $s_\mu$ to the expansion. The relevant tableaux are shown
here:


We conclude that


$$s_{(3,3,2,2)/(2,1)} = 1 s_{(3,3,1)} + 1 s_{(3,2,2)} + 1 s_{(3,2,1,1)} + 1 s_{(2,2,2,1)}.$$
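
For small shapes, the Littlewood-Richardson rule of 10.66 can be checked by exhaustive enumeration. The sketch below is a brute-force illustration, not an efficient algorithm, and the function name `lr_coefficient` is hypothetical. It fills the cells of $\lambda/\nu$ row by row with the prescribed content, enforcing weakly increasing rows and strictly increasing columns, and then tests the suffix condition on the word read from the bottom row to the top row.

```python
def lr_coefficient(lam, nu, mu):
    """Number of semistandard tableaux of shape lam/nu and content mu
    such that every suffix of the word has partition content."""
    lam = tuple(p for p in lam if p > 0)
    nu = tuple(p for p in nu if p > 0)
    mu = tuple(p for p in mu if p > 0)
    rows = len(lam)
    if len(nu) > rows:
        return 0
    nu = nu + (0,) * (rows - len(nu))
    if any(nu[r] > lam[r] for r in range(rows)):
        return 0
    if sum(lam) - sum(nu) != sum(mu):
        return 0
    cells = [(r, c) for r in range(rows) for c in range(nu[r], lam[r])]
    filling = {}
    remaining = list(mu)

    def suffixes_ok():
        # word = rows read from bottom to top, each row left to right
        word = [filling[(r, c)]
                for r in reversed(range(rows))
                for c in range(nu[r], lam[r])]
        counts = [0] * (len(mu) + 1)
        for x in reversed(word):            # grow the suffix one symbol at a time
            counts[x] += 1
            if x > 1 and counts[x] > counts[x - 1]:
                return False
        return True

    def fill(idx):
        if idx == len(cells):
            return 1 if suffixes_ok() else 0
        r, c = cells[idx]
        left = filling.get((r, c - 1))
        above = filling.get((r - 1, c))
        total = 0
        for value in range(1, len(mu) + 1):
            if remaining[value - 1] == 0:
                continue
            if left is not None and value < left:
                continue                    # rows weakly increase
            if above is not None and value <= above:
                continue                    # columns strictly increase
            filling[(r, c)] = value
            remaining[value - 1] -= 1
            total += fill(idx + 1)
            remaining[value - 1] += 1
            del filling[(r, c)]
        return total

    return fill(0)


print(lr_coefficient((5, 4, 4, 1), (3, 1), (4, 4, 2)))
print(lr_coefficient((3, 3, 2, 2), (2, 1), (3, 2, 1, 1)))
```

The two calls correspond to Example 10.69 and to one of the terms in Example 10.70.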


10.18 Products of Schur Polynomials 


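Given partitions $\mu \in \mathrm{Par}_N(m)$ and $\nu \in \mathrm{Par}_N(n)$, the product $s_\mu(x_1, \ldots, x_N)\, s_\nu(x_1, \ldots, x_N)$
is a symmetric polynomial, so it can be expressed uniquely in terms of Schur polynomials
$s_\lambda(x_1, \ldots, x_N)$ indexed by $\lambda \in \mathrm{Par}_N(m+n)$:

$$s_\mu s_\nu = \sum_{\lambda} a(\lambda,\mu,\nu)\, s_\lambda \quad \text{for some } a(\lambda,\mu,\nu) \in \mathbb{R}. \tag{10.7}$$

By Theorem 10.57, the coefficients here are precisely the Littlewood-Richardson numbers:

$$a(\lambda,\mu,\nu) = \langle s_\mu s_\nu, s_\lambda \rangle = \langle s_\nu, s_{\lambda/\mu} \rangle = c^{\lambda}_{\mu,\nu}.$$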


Since $s_\mu s_\nu = s_\nu s_\mu$ (because multiplication of polynomials is commutative), we deduce the
symmetry property:

$$c^{\lambda}_{\mu,\nu} = c^{\lambda}_{\nu,\mu}.$$


We now derive another combinatorial expression for these integers by viewing the product
$s_\mu s_\nu$ as a skew Schur polynomial. We claim that $s_\mu s_\nu = s_{\alpha/\beta}$, where

$$\alpha = (\mu_1 + \nu_1,\ \mu_1 + \nu_2,\ \ldots,\ \mu_1 + \nu_N,\ \mu_1,\ \ldots,\ \mu_N) \quad\text{and}\quad \beta = (\mu_1^N).$$

This follows since the skew shape $\alpha/\beta$ consists of two disconnected pieces, one of shape $\nu$
and one of shape $\mu$. A semistandard skew tableau of this shape can be formed by choosing
a semistandard tableau of shape $\nu$ and independently choosing a semistandard tableau of
shape $\mu$; thus the result follows from the Product Rule for Weighted Sets. We conclude that

$$c^{\lambda}_{\mu,\nu} = c^{\alpha}_{\beta,\lambda} \quad (\text{with } \alpha, \beta \text{ as above}).$$
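
As a small illustration of this construction, the following Python sketch builds the pair $(\alpha, \beta)$ from $\mu$, $\nu$, and the number of variables $N$. The function name `product_as_skew` is hypothetical, and both partitions are padded with zeros to length $N$ before the formula is applied.

```python
def product_as_skew(mu, nu, n_vars):
    """Return (alpha, beta) with s_mu * s_nu = s_{alpha/beta} in n_vars variables.

    alpha = (mu_1 + nu_1, ..., mu_1 + nu_N, mu_1, ..., mu_N), beta = (mu_1^N).
    Assumes mu is nonempty and that mu, nu have at most n_vars parts.
    """
    mu = tuple(mu) + (0,) * (n_vars - len(mu))
    nu = tuple(nu) + (0,) * (n_vars - len(nu))
    alpha = tuple(mu[0] + part for part in nu) + mu
    beta = (mu[0],) * n_vars
    return alpha, beta


print(product_as_skew((2, 1), (2, 1), 2))
```

With $\mu = \nu = (2,1)$ and $N = 2$ this gives the skew shape $(4,3,2,1)/(2,2)$.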


This formula is illustrated in the next example. 
10.71. Example. Let us compute the Schur expansion of $s_{(2,1)} s_{(2,1)}$ using the observation
$s_{(2,1)} s_{(2,1)} = s_{(4,3,2,1)/(2,2)}$. The following skew tableaux have words such that all suffixes
have partition content:


Looking at contents, we conclude that


$$s_{(2,1)} s_{(2,1)} = s_{(4,2)} + s_{(4,1,1)} + 2 s_{(3,2,1)} + s_{(3,3)} + s_{(2,2,2)} + s_{(2,2,1,1)} + s_{(3,1,1,1)}.$$


Observe that the upper-right portion of the skew tableau could be filled in only one way. So 
we could ignore this part of the tableau and just consider allowable fillings of the lower shape. 
Generalizing this remark leads to the following prescription for the Littlewood—Richardson 
coefficients. 


10.72. Theorem: Alternate Formula for Littlewood-Richardson Coefficients.
For all partitions $\lambda, \mu, \nu \in \mathrm{Par}_N$, the coefficient $c^{\lambda}_{\mu,\nu} = c^{\lambda}_{\nu,\mu}$ is the number of semistandard
tableaux $T$ of shape $\mu$ and content $\lambda - \nu = (\lambda_i - \nu_i : 1 \le i \le N)$ such that $w(T) = w_1 \cdots w_n$
satisfies the following condition: for all $k$ between 1 and $n$, the exponent vector of the
monomial $x_1^{\nu_1} \cdots x_N^{\nu_N} \prod_{j=k}^{n} x_{w_j}$ is a partition. (This condition means that for all $j, k$ with
$1 \le j < N$ and $1 \le k \le n$, if there are $a$ copies of $j$ and $b$ copies of $j+1$ in the suffix
$w_k w_{k+1} \cdots w_n$, then $\nu_j + a \ge \nu_{j+1} + b$.)


Proof. We already know that $c^{\lambda}_{\mu,\nu} = c^{\alpha}_{\beta,\lambda}$, where the skew shape $\alpha/\beta$ consists of an upper
part of shape $\nu$ and a lower part of shape $\mu$. We also know that $c^{\alpha}_{\beta,\lambda}$ is the number of skew
tableaux $U$ of shape $\alpha/\beta$ and content $\lambda$ such that every suffix of $w(U)$ has partition content.
Consider the last $|\nu|$ symbols in $w(U)$. The last symbol is the label in the rightmost cell of
the first row of the skew shape $\alpha/\beta$. The partition content condition forces this symbol to
be 1, and then all symbols in the first row of the skew tableau $U$ must be 1. The symbol at
the end of the second row must be strictly greater than 1, so it is 2 (by the partition content
condition), and then every entry in the second row must be 2. Proceeding in this way, we
see that for $k$ between 1 and $N$, every entry in row $k$ of $U$ must equal $k$. Equivalently, the
last $|\nu|$ symbols of $w(U)$ must be $N^{\nu_N} \cdots 2^{\nu_2} 1^{\nu_1}$. Call this suffix $z$.

Now we must fill the lower part of the shape $\alpha/\beta$ by choosing a semistandard tableau
$T$ of shape $\mu$. Because the upper part of $U$ has content $\nu$, the content of the entire skew
tableau $U$ is $\lambda$ iff the content of the lower part $T$ is $\lambda - \nu$. The other condition imposed on
$T$ is that for every suffix $y$ of $w(T)$, the suffix $yz$ of $w(U)$ has partition content. Given the
formula for $z$ above, this condition is equivalent to the condition on $w(T)$ in the theorem
statement. □
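
The alternate formula can also be tested by brute force on small cases. The following Python sketch is an illustration only (the function name `lr_alternate` is hypothetical): it enumerates semistandard tableaux of straight shape $\mu$ with content $\lambda - \nu$ and checks, suffix by suffix, that adding the suffix content to $\nu$ gives a partition.

```python
def lr_alternate(lam, mu, nu):
    """Count semistandard tableaux T of shape mu and content lam - nu such that,
    for every suffix of w(T), nu plus the content of the suffix is a partition."""
    n = max(len(lam), len(mu), len(nu))
    lam = tuple(lam) + (0,) * (n - len(lam))
    mu = tuple(mu) + (0,) * (n - len(mu))
    nu = tuple(nu) + (0,) * (n - len(nu))
    content = [lam[i] - nu[i] for i in range(n)]
    if min(content) < 0 or sum(content) != sum(mu):
        return 0
    cells = [(r, c) for r in range(n) for c in range(mu[r])]
    filling = {}
    remaining = list(content)

    def condition_ok():
        # word = rows read from bottom to top, each row left to right
        word = [filling[(r, c)] for r in reversed(range(n)) for c in range(mu[r])]
        exponents = list(nu)
        for x in reversed(word):            # grow the suffix one symbol at a time
            exponents[x - 1] += 1
            if x > 1 and exponents[x - 1] > exponents[x - 2]:
                return False
        return True

    def fill(idx):
        if idx == len(cells):
            return 1 if condition_ok() else 0
        r, c = cells[idx]
        left = filling.get((r, c - 1))
        above = filling.get((r - 1, c))
        total = 0
        for value in range(1, n + 1):
            if remaining[value - 1] == 0:
                continue
            if left is not None and value < left:
                continue                    # rows weakly increase
            if above is not None and value <= above:
                continue                    # columns strictly increase
            filling[(r, c)] = value
            remaining[value - 1] -= 1
            total += fill(idx + 1)
            remaining[value - 1] += 1
            del filling[(r, c)]
        return total

    return fill(0)


# lambda = (5,4,4,1), with the roles of the two smaller partitions exchanged.
print(lr_alternate((5, 4, 4, 1), (3, 1), (4, 4, 2)))
print(lr_alternate((5, 4, 4, 1), (4, 4, 2), (3, 1)))
```

Both calls return the same value, in agreement with the symmetry property of the coefficients.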




Summary 


Table 10.1 contains formulas derived in this chapter for computing with antisymmetric and 
symmetric polynomials. 
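TABLE 10.1
Formulas for manipulating antisymmetric and symmetric polynomials.


The Pieri Rules:

    $a_{\lambda+\delta(N)}\, p_k = \sum_{\beta:\ \beta/\lambda \text{ is a } k\text{-ribbon } R} \mathrm{sgn}(R)\, a_{\beta+\delta(N)}$
    $a_{\lambda+\delta(N)}\, e_k = \sum_{\beta:\ \beta/\lambda \text{ is a vertical } k\text{-strip}} a_{\beta+\delta(N)}$
    $a_{\lambda+\delta(N)}\, h_k = \sum_{\beta:\ \beta/\lambda \text{ is a horizontal } k\text{-strip}} a_{\beta+\delta(N)}$
    $s_\lambda\, p_k = \sum_{\beta:\ \beta/\lambda \text{ is a } k\text{-ribbon } R} \mathrm{sgn}(R)\, s_\beta$
    $s_\mu\, p_\alpha = \sum_{\lambda} \chi^{\lambda/\mu}_{\alpha}\, s_\lambda$

Determinant formula for $s_\lambda$:

    $s_\lambda = \dfrac{a_{\lambda+\delta(N)}}{a_{\delta(N)}} = \dfrac{\det[x_j^{\lambda_i+N-i}]_{1 \le i,j \le N}}{\det[x_j^{N-i}]_{1 \le i,j \le N}}$

Schur expansion of power-sums:

    $p_\alpha = \sum_{\lambda} \chi^{\lambda}_{\alpha}\, s_\lambda$

Power-sum expansion of Schur polynomials:

    $s_\lambda = \sum_{\mu} \dfrac{\chi^{\lambda}_{\mu}}{z_\mu}\, p_\mu$

Formulas for skew Schur polynomials:

    $s_{\lambda/\mu} = \sum_{\nu} \dfrac{\chi^{\lambda/\mu}_{\nu}}{z_\nu}\, p_\nu$
    $\langle s_{\lambda/\mu}, f \rangle = \langle s_\lambda, s_\mu f \rangle$ for $f \in \Lambda_N$
    $\omega(s_{\lambda/\mu}) = s_{\lambda'/\mu'}$
    $s_{\lambda/\mu} = \det[h_{\lambda_i - \mu_j + j - i}]_{1 \le i,j \le \ell(\lambda)}$
    $s_{\lambda/\mu} = \det[e_{\lambda'_i - \mu'_j + j - i}]_{1 \le i,j \le \lambda_1}$
    $s_{\lambda/\mu} = \sum_{\nu} c^{\lambda}_{\mu,\nu}\, s_\nu$
    $s_\mu s_\nu = \sum_{\lambda} c^{\lambda}_{\mu,\nu}\, s_\lambda$

Inverse Kostka formulas:

    $K'_{\lambda,\mu} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S)$
    $m_\lambda = \sum_{\mu} K'_{\lambda,\mu}\, s_\mu$
    $s_\mu = \sum_{\lambda} K'_{\lambda,\mu}\, h_\lambda$
    $s_{\mu'} = \sum_{\lambda} K'_{\lambda,\mu}\, e_\lambda$
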
• Unlabeled Abaci. An abacus is a function $w : \mathbb{Z} \to \{0,1\}$ with $w(i) = 1$ for all small
enough $i$ and $w(j) = 0$ for all large enough $j$. Justification of abaci gives a bijection to
pairs $(m, \lambda) \in \mathbb{Z} \times \mathrm{Par}$. The inverse bijection can be computed by traversing the frontier
of $\mathrm{dg}(\lambda)$, converting north steps to beads (1's) and east steps to gaps (0's), and using $m$
to decide which step on the frontier corresponds to position 0 of $w$.


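• The Jacobi Triple Product Identity. Abaci can be used to prove

$$\sum_{m \in \mathbb{Z}} q^{m(m+1)/2} u^m = \prod_{n=1}^{\infty} (1 + u q^n) \prod_{n=0}^{\infty} (1 + u^{-1} q^n) \prod_{n=1}^{\infty} (1 - q^n).$$

One consequence is the formula $\prod_{n=1}^{\infty} (1 - q^n) = \sum_{k \in \mathbb{Z}} (-1)^k q^{k(3k-1)/2}$.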


• k-Cores and k-Quotients. Given a partition $\mu$, repeated removal of border ribbons of size
$k$ (in any order) leads to a unique partition from which no further ribbons of this kind
can be removed. This partition is called the $k$-core of $\mu$. We can also find the $k$-core by
converting $\mu$ to an abacus, decimating the abacus to give a $k$-runner abacus, justifying all
runners, and converting back to a partition. Each ribbon removal corresponds to moving
one bead one step to the left on the $k$-runner abacus. Justifying each separate runner
on the $k$-runner abacus for $\mu$ produces the $k$-quotients $(\nu^0, \ldots, \nu^{k-1})$ of $\mu$. Alternatively,
$\mathrm{dg}(\nu^i)$ can be found by taking the cells of $\mathrm{dg}(\mu)$ lying due north and due west of steps of
$k$-content $i$ on the frontier of $\mu$. We get a bijection $\Delta_k : \mathrm{Par} \to \mathrm{Core}(k) \times \mathrm{Par}^k$ by mapping
$\mu$ to its $k$-core and $k$-quotients.


• Labeled Abaci and Antisymmetric Polynomials. A polynomial $f$ in $N$ variables is antisymmetric
iff interchanging any two adjacent variables changes the sign of $f$. For each
$\mu = (\mu_1 > \mu_2 > \cdots > \mu_N \ge 0)$, the polynomial $a_\mu(x_1, \ldots, x_N) = \det[x_j^{\mu_i}]_{1 \le i,j \le N}$ is
antisymmetric. Writing $\delta(N) = (N-1, N-2, \ldots, 2, 1, 0)$, the set $\{a_{\lambda+\delta(N)} : \lambda \in \mathrm{Par}_N\}$
is a basis for the vector space $A_N$ of antisymmetric polynomials. Division by $a_{\delta(N)} =
\prod_{1 \le i < j \le N} (x_i - x_j)$ gives a vector space isomorphism from $A_N$ to $\Lambda_N$ sending $a_{\lambda+\delta(N)}$ to
the Schur polynomial $s_\lambda = a_{\lambda+\delta(N)} / a_{\delta(N)}$. To model the terms in $a_{\lambda+\delta(N)}$, we use the $N!$
labeled abaci consisting of beads $1, 2, \ldots, N$ (in any order) at positions given by $\lambda + \delta(N)$.


• Rim-Hook Tableaux. A rim-hook tableau of shape $\lambda/\mu$ and content $\alpha$ is obtained by
enlarging the diagram of $\mu$ using border ribbons of lengths $\alpha_1, \alpha_2, \ldots$ (in this order) until
the diagram of $\lambda$ is obtained. The set of such tableaux is denoted $\mathrm{RHT}(\lambda/\mu, \alpha)$. A ribbon
occupying $r$ rows has sign $(-1)^{r-1}$, and the sign of a rim-hook tableau is the product of
the signs of its ribbons. We write $\chi^{\lambda/\mu}_{\alpha} = \sum_{T \in \mathrm{RHT}(\lambda/\mu,\alpha)} \mathrm{sgn}(T)$. We have $\chi^{\lambda/\mu}_{\alpha} = \chi^{\lambda/\mu}_{\beta}$
whenever $\mathrm{sort}(\alpha) = \mathrm{sort}(\beta)$.


• Interactions between Abaci and Tableaux. One can give combinatorial proofs of several
identities in Table 10.1 by using the word of a tableau to encode bead motions on abaci. 
When these motions lead to bead collisions, one obtains two objects of opposite sign and 
equal weight that cancel terms on one side of the formula to be proved. Objects with no 
collisions are fixed points that can be reorganized to give the other side of the formula. 


• The Inverse Kostka Matrix. A rim-hook tableau is called special iff each ribbon in the
tableau begins in the leftmost column; $\mathrm{SRHT}(\mu, \lambda)$ is the set of such tableaux of shape $\mu$
and content $\alpha$ with $\mathrm{sort}(\alpha) = \lambda$. Letting $K'_{\lambda,\mu} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S)$, we have $KK' = I$.


• Littlewood-Richardson Coefficients. The scalars $c^{\lambda}_{\nu,\mu} = c^{\lambda}_{\mu,\nu}$ appearing in the Schur
expansions of $s_{\lambda/\nu}$ and $s_\mu s_\nu$ count semistandard tableaux $T$ of shape $\lambda/\nu$ and content $\mu$
such that every suffix of the word of $T$ has partition content. The scalars $c^{\lambda}_{\nu,\mu}$ also count
semistandard tableaux $T$ of shape $\mu$ and content $\lambda - \nu$ such that $w(T) = w_1 \cdots w_n$ satisfies
the following condition: for all $k$ between 1 and $n$, the exponent vector of $\prod_{j} x_j^{\nu_j} \prod_{i=k}^{n} x_{w_i}$
is a partition.
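
The abacus description of the $k$-core summarized above translates into a few lines of code. The following Python sketch is a minimal illustration under stated assumptions: the function name `k_core` is hypothetical, the partition is encoded by the bead positions $\mu_i + (n - i)$ for its $n$ nonzero parts, and "justifying all runners" is carried out by pushing the beads on each residue class modulo $k$ as far left as possible.

```python
def k_core(partition, k):
    """Compute the k-core of a partition via a k-runner abacus."""
    parts = [p for p in partition if p > 0]
    n = len(parts)
    # Bead positions: strictly decreasing sequence mu_i + (n - i), i = 1..n.
    beads = [parts[i] + (n - 1 - i) for i in range(n)]
    # Decimate: count the beads on each runner (residue class mod k).
    per_runner = [0] * k
    for b in beads:
        per_runner[b % k] += 1
    # Justify every runner: beads on runner r move to r, r + k, r + 2k, ...
    justified = sorted(
        (r + k * t for r in range(k) for t in range(per_runner[r])),
        reverse=True,
    )
    # Convert the bead positions back to a partition.
    core = [justified[i] - (n - 1 - i) for i in range(n)]
    return tuple(p for p in core if p > 0)


print(k_core((10, 10, 10, 8, 8, 8, 7, 4), 3))
```

The call shown uses the partition from Exercise 10-11, whose 3-core is stated there to be $(1,1)$.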




Exercises 


10-1. Let $w = \cdots 1101101110101001100 \cdots$. Compute $\mathrm{wt}(w)$ and $J(w)$.

10-2. Compute $U(-1, \mu)$ for each $\mu \in \mathrm{Par}(5)$.

10-3. In the computation of $U(m, \mu)$ in Remark 10.4, describe in detail how to use $m$ to
decide which symbol in the bead-gap sequence is $w_0$.

10-4. Given $\mu \in \mathrm{Par}$, what is the relationship between the abaci $U(-1, \mu)$ and $U(-1, \mu')$?
10-5. Show that the abacus w in Example 10.2 and its justification J(w) have the same 
weight if we use the weights defined in the proof of Theorem 10.5. 

10-6. Show how to deduce Euler’s Pentagonal Number Theorem as an algebraic consequence 
of the Jacobi Triple Product Identity. 

10-7. Fill in the details in the proof of Theorem 10.6. 

10-8. Use Theorem 10.5 to simplify the product $\prod_{n=0}^{\infty} (1 - x^{5n+1})^{-1} \prod_{n=0}^{\infty} (1 - x^{5n+4})^{-1}$
appearing in one of the Rogers-Ramanujan Identities. Can you give a direct proof of the
resulting identity using abaci?

10-9. Use Remark 10.4 to find a bijective proof of Theorem 10.5 that makes no reference 
to abaci, instead using combinatorial operations on partition diagrams and their frontiers. 
10-10. Complete the proof of Theorem 10.12 by verifying that $D_k(w) \in \mathrm{Abc}^k$, $I_k(v) \in \mathrm{Abc}$,
and $D_k$ and $I_k$ are two-sided inverses.

10-11. (a) Verify that the 3-core of $\mu = (10,10,10,8,8,8,7,4)$ is $(1,1)$ by removing border
3-ribbons from $\mu$ in several different orders. (b) Use the 3-runner abacus encoding $\mu$ to
determine exactly how many ways there are to change $\mu$ into $(1,1)$ by removing an ordered
sequence of border 3-ribbons.

10-12. Let $\mu = (8,7,6,4,4,4,3,1,1,1)$. Use abaci to compute the $k$-core and $k$-quotients of
$\mu$ for $1 \le k \le 6$.

10-13. Find all integer partitions that are 2-cores, and draw some of their diagrams. 
10-14. Find all 3-cores with at most 8 cells. 

10-15. Verify the assertion in the last sentence of Example 10.20. 

10-16. Let $\mu = (8,8,8,8,8,8,8,8)$. (a) Use abaci to compute the $k$-core and $k$-quotients of
$\mu$ for $3 \le k \le 8$. (b) Use the construction at the end of §10.5 to compute the $k$-quotients of
$\mu$ (for $3 \le k \le 8$) directly from the diagram of $\mu$.

10-17. For $k = 3, 4, 5$, compute the $k$-quotients of $\mu = (6,6,6,3,3,2,2,2,1,1)$ without using
abaci.

10-18. Consider the construction at the end of §10.5 for computing $k$-quotients of $\mu$. Show
that the hook-length of each unerased cell is divisible by $k$, and these are the only cells in
the diagram of $\mu$ whose hook-lengths are divisible by $k$.

10-19. For each $k \ge 1$, find a formula for the generating function $\sum_{\mu \in \mathrm{Core}(k)} q^{|\mu|}$.

10-20. Given that $\mu$ has $k$-core $\rho$ and $k$-quotients $\nu^0, \ldots, \nu^{k-1}$, find a formula for the number
of ways we can go from $\mu$ to $\rho$ by removing an ordered sequence of border $k$-ribbons.
10-21. Compute $a_\mu(x_1, x_2, x_3)$ and $a_{\lambda+\delta(3)}(x_1, x_2, x_3)$ for $\mu = (6,3,1)$ and $\lambda = (2,2,1)$.

10-22. Verify by direct calculation that, for $N = 3$ and $\lambda = (2,1,0)$, $a_{\lambda+\delta(N)}$ is divisible by
$a_{\delta(N)}$ and $a_{\lambda+\delta(N)} / a_{\delta(N)} = s_\lambda(x_1, \ldots, x_N)$.

10-23. Verify that $\Lambda_N$ and $A_N$ are subspaces of $\mathbb{R}[x_1, \ldots, x_N]$, and that the map sending
$f \in \Lambda_N$ to $f a_{\delta(N)}$ is linear.

10-24. (a) Show that the product of two antisymmetric polynomials is symmetric. (b) Show 
that the product of a symmetric polynomial and an antisymmetric polynomial is antisym- 
metric. 


10-25. Define a map $T : \mathbb{R}[x_1, \ldots, x_N] \to \mathbb{R}[x_1, \ldots, x_N]$ by setting

$$T(f) = \frac{1}{N!} \sum_{w \in S_N} \mathrm{sgn}(w)\, f(x_{w(1)}, \ldots, x_{w(N)}).$$

Show that $T$ is a linear map with image $A_N$ whose restriction to $A_N$ is the identity map.
Can you describe the kernel of $T$?

10-26. Let $v$ be the labeled abacus $v = 0041000300502600\cdots$. Compute $\mathrm{wt}(v)$, $w(v)$, $\mathrm{pos}(v)$,
and $\mathrm{sgn}(v)$. For which $\lambda$ is $v$ in $\mathrm{LAbc}(\lambda + \delta(6))$?

10-27. Draw all the labeled abaci in $\mathrm{LAbc}(6,5,1)$, and compute the sign of each abacus.

10-28. Using $N = 6$ variables, compute all terms in: (a) $a_{(4,2,1)+\delta(6)}\, p_4$; (b) $a_{(3,3,3)+\delta(6)}\, p_3$;
(c) $a_{(1,1,1,1,1)+\delta(6)}\, p_2$. How would the answers change if we changed $N$?

10-29. Let $v = 0310040206500\cdots \in \mathrm{LAbc}(\lambda + \delta(6))$ and $k = 4$. For $1 \le i \le 6$, compute
$I(v, i)$ where $I$ is the involution in the proof of Theorem 10.39. For any fixed points that
arise, compute $v^*$ and indicate which border 4-ribbon is added to $\mathrm{dg}(\lambda)$ in the passage from
$v$ to $v^*$.

10-30. Using $N = 6$ variables, compute all terms in: (a) $a_{(4,2,1)+\delta(6)}\, e_3$; (b) $a_{(3,3,3)+\delta(6)}\, e_2$;
(c) $a_{(5,4,3,1,1)+\delta(6)}\, e_4$. How would the answers change if we changed $N$?

10-31. Let $v = 0310040206500\cdots \in \mathrm{LAbc}(\lambda + \delta(6))$. Compute $I(v, S)$ for $S = \{2,5,6\}$,
$S = \{1,4,5\}$, $S = \{1,3,4\}$, and $S = \{3,4,6\}$, where $I$ is the involution in the proof of
Theorem 10.42. For any fixed points that arise, compute $v^*$ and indicate which vertical
strip is added to $\mathrm{dg}(\lambda)$ in the passage from $v$ to $v^*$.

10-32. Using $N = 6$ variables, compute all terms in: (a) $a_{(4,2,1)+\delta(6)}\, h_3$; (b) $a_{(3,3,3)+\delta(6)}\, h_3$;
(c) $a_{(5,4,3,1,1)+\delta(6)}\, h_4$. How would the answers change if we changed $N$?

10-33. Let $v = 0310040206500\cdots \in \mathrm{LAbc}(\lambda + \delta(6))$. Compute $I(v, M)$ for $M = [1,1,4,5]$,
$M = [2,2,5,6]$, $M = [2,4,5,5]$, and $M = [1,2,3,4]$, where $I$ is the involution in the proof of
Theorem 10.44. For any fixed points that arise, compute $v^*$ and indicate which horizontal
strip is added to $\mathrm{dg}(\lambda)$ in the passage from $v$ to $v^*$.

10-34. Explain in detail why the bead motion rule in §10.10 leads to the addition of a 
horizontal k-strip to the shape A, assuming no bead collision occurs. 

10-35. In the proof of Theorem 10.44, check in detail that I reverses signs, preserves weights, 
and is an involution. 

10-36. Reprove Theorem 10.45 by comparing the symmetric and antisymmetric Pieri Rules
for multiplication by $e_k$.

10-37. Expand the following symmetric polynomials into linear combinations of Schur polynomials:
(a) $s_{(3,3,2)}\, p_3$; (b) $p_{(3,1,3)}$; (c) $s_{(2,2)}\, p_{(2,1)}$.

10-38. Compute the coefficients of the following Schur polynomials in the Schur expansion
of $p_{(3,3,2,1)}$: (a) $s_{(9)}$; (b) $s_{(3,3,3)}$; (c) $s_{(4,4,1)}$; (d) $s_{(1^9)}$.

10-39. Show that, for $\lambda \in \mathrm{Par}(n)$, $\chi^{\lambda}_{(1^n)} = |\mathrm{SYT}(\lambda)|$.




10-40. Write $s_{(3,2,1)}$ as a linear combination of power-sum polynomials.

10-41. For each $\mu \in \mathrm{Par}(4)$, write $p_\mu$ in terms of Schur polynomials, and write $s_\mu$ in terms
of power-sum polynomials.

10-42. Let $I$ be the involution in §10.13. For each $(v, T) \in X$ given below, compute $I(v, T)$.
If $(v, T)$ is a fixed point, compute $v^* \in Y$.


(a) v = 5432100---, T= 


(b) v = 2431500---, T=[1 111 


(c) v = 3452100---, T= 


10-43. Let $I$, $X$, and $Y$ be defined as in §10.13. Take $N = 3$ and $\lambda = (2,1,0)$. List all the
elements of $X$ and $Y$, compute the action of $I$ on $X$, and show how the fixed points of $I$
map bijectively to $Y$.

10-44. Verify all the assertions stated before Example 10.54. 

10-45. Express $s_{(4,3,1)/(2,1)}$ as a linear combination of power-sums.

10-46. Explain why the formulas $\omega(h_\mu) = e_\mu$ and $\omega(e_\mu) = h_\mu$ are special cases of
Theorem 10.59.

10-47. For $N \ge k$, two linear operators $S$ and $T$ on $\Lambda^k_N$ are called adjoint iff $\langle S(f), g \rangle =
\langle f, T(g) \rangle$ for all $f, g \in \Lambda^k_N$. Prove that this condition holds for all such $f, g$ iff it holds for
all $f$ in some basis of $\Lambda^k_N$ and all $g$ in some (possibly different) basis of $\Lambda^k_N$.

10-48. Write the following Schur polynomials in terms of the complete symmetric polynomials
$h_\mu$: (a) $s_{(5,3)}$; (b) $s_{(4,1,1)}$; (c) $s_{(6,5,2,2)}$.

10-49. Write the following Schur polynomials in terms of the elementary symmetric polynomials
$e_\mu$: (a) $s_{(2,2,2,2)}$; (b) $s_{(3,2,1)}$; (c) $s_{(4,2)}$.

10-50. Write the skew Schur polynomial $s_{(4,4,3)/(2,1,1)}$ in terms of the following bases: (a) $h_\mu$;
(b) $e_\mu$; (c) $p_\mu$; (d) $m_\mu$.

10-51. Modify the definition of the involution used in the proof of Theorem 10.60 as follows.
If two or more paths in $(w, p_1, \ldots, p_n)$ intersect, choose $i$ minimal and then $j$ minimal such
that $p_i$ and $p_j$ intersect. Let $(u, v)$ be the earliest vertex on $p_i$ that is also a vertex of $p_j$,
and switch the initial segments of these two paths as in the original proof. Show that the
map just defined is not always an involution.

10-52. Can you find a way to rephrase the combinatorial proof of Theorem 10.60 in terms 
of abaci? 

10-53. Enumerate special rim-hook tableaux to compute $K'_{\lambda,\mu}$ for all partitions $\lambda, \mu$ with
at most four cells. Use this to confirm by direct calculation that $KK' = I$.

10-54. Find and prove a Pieri-type rule giving the Schur expansion of a product s,m). 
10-55. Let $K'$ be the matrix defined combinatorially by $K'_{\lambda,\mu} = \sum_{S \in \mathrm{SRHT}(\mu,\lambda)} \mathrm{sgn}(S)$. Find
involutions that prove $KK' = I$.

10-56. Let $K'$ be the inverse Kostka matrix, defined using special rim-hook tableaux. Can
you prove the identity $K'K = I$ combinatorially?

10-57. Let I be the involution in the proof of Theorem 10.66. (a) Compute $I(v^0, T)$, where 
$v^0 = 5432100\cdots$. 
(b) Answer (a) if the last 1 in the top row of T is changed to a 2. (c) Answer (a) if the last 
3 in row 2 of T is changed to a 2. 

10-58. Compute $c^{\lambda}_{\mu,\nu}$ and $c^{\lambda}_{\nu,\mu}$ using Theorem 10.66, where: (a) $\lambda = (5,3,1,1)$, $\mu = (3,1)$, 
$\nu = (3,2,1)$; (b) $\lambda = (5,4,4,3,1)$, $\mu = (4,3,3,1)$, $\nu = (3,1,1,1)$. 

10-59. Repeat the previous exercise, but use Theorem 10.72 to compute the 
Littlewood–Richardson coefficients. 

10-60. Continuing Example 10.69, find the expansion of $s_{(5,4,4,1)/(3,1)}$ into a sum of Schur 
polynomials. 

10-61. Expand the following skew Schur polynomials into sums of Schur polynomials: 
(a) $s_{(3,3,3)/(2,1)}$; (b) $s_{(5,4)/(2)}$; (c) $s_{(4,3,2,1)/(1,1,1)}$. 

10-62. Expand 5/3 2) (2,2) into a sum of Schur polynomials. 

10-63. In the Schur expansion of $s_{(3,2,1,1)}^2$, find the coefficients of: (a) $s_{(5,4,2,2,1)}$; 
(b) $s_{(5,3,3,1,1,1)}$; (c) $s_{(4,3,3,2,1,1)}$. 

10-64. Give a combinatorial proof of Theorem 10.72 based on abaci. 




Notes 


The proof of the Jacobi Triple Product Identity in §10.2 is adapted from a lecture of Richard 
Borcherds. One source for material on unlabeled abaci, k-cores, and k-quotients is the 
book by James and Kerber [67]; for labeled abaci, see [79]. Gessel and Viennot have used 
intersecting lattice path models to prove many enumeration results [46]. The combinatorial 
interpretation of the inverse Kostka matrix is due to Egecioglu and Remmel [29]. The proof 
of the Littlewood–Richardson rule given in §10.17 may be viewed as a combinatorialization 
of the algebraic proof in [106]. Many other proofs of this rule may be found in the literature; 
see, e.g., the bibliographic notes in [41] and [121, Ch. 7]. 


Algebraic Aspects of Generating Functions 


In Chapter 5, we gave an introduction to generating functions emphasizing their applications 
to combinatorial problems. This chapter takes a closer look at some algebraic aspects of 
formal power series. We study some new operations on formal power series such as infinite 
sums, infinite products, formal exponentials, formal logarithms, and formal composition. 
To define these operations, we need to use the analytic concepts of limits and continuity for 
formal power series. A major goal of this chapter is to develop algebraic and combinatorial 
formulas for the coefficients in the multiplicative inverse and the compositional inverse of 
a formal power series, when these inverses exist. We also prove some theorems regarding 
partial fraction decompositions and recursions with constant coefficients, along with infinite 
versions of the Sum Rule and Product Rule for Weighted Sets. These results were used 
informally in Chapter 5. 

Throughout this chapter, let $K$ denote a field of characteristic zero (as defined in the 
Appendix). Assuming that the characteristic is zero ensures that $n^{-1}$ exists in $K$ for each 
positive integer $n$. This enables us to define power series such as $e^z = \sum_{n=0}^{\infty} z^n/n!$. 


11.1 Limit Concepts for Formal Power Series 


This section studies various limit concepts for formal power series, including the ideas of 
infinite sums and products of formal series. We begin by reviewing the definitions of the 
algebraic operations on formal power series from Chapter 5. First recall that an element $F$ 
of a formal power series ring $K[[z]]$ is defined to be an infinite sequence $F = (a_n : n \in \mathbb{Z}_{\ge 0})$ 
where each $a_n$ belongs to the field $K$. We often write $F = F(z) = \sum_{n=0}^{\infty} a_n z^n$ and call $a_n$ the 
coefficient of $z^n$ in $F$. But, at the moment, the summation symbol appearing here is merely 
notation designed to suggest the analogy between formal power series and polynomials; it 
does not mean that we are adding up infinitely many terms. Similarly, the formal power 
series $F$ is not a function of the variable $z$. The powers of $z$ are notational placeholders 
used to display the coefficients $a_n$. To emphasize this point, we may refer to $z$ as a formal 
indeterminate or formal variable. 

Given $F = \sum_{n=0}^{\infty} a_n z^n$ and $G = \sum_{n=0}^{\infty} b_n z^n$ in $K[[z]]$, we have $F = G$ (equality of 
formal series) iff $a_n = b_n$ for all $n \in \mathbb{Z}_{\ge 0}$. We define the sum $F + G = \sum_{n=0}^{\infty} (a_n + b_n) z^n$ 
and the product $FG = \sum_{n=0}^{\infty} c_n z^n$, where $c_n = \sum_{k=0}^{n} a_k b_{n-k}$. Note that each particular 
coefficient in $F + G$ or $FG$ can be computed using only finitely many operations in the 
field $K$. We use the notation $F|_{z^k}$ to denote $a_k$, the coefficient of $z^k$ in $F$. We can identify 
each scalar $c \in K$ with the constant power series $(c, 0, 0, \ldots) = c + 0z + 0z^2 + \cdots$. Then 
scalar multiplication satisfies $cF = \sum_{n=0}^{\infty} c a_n z^n$, which is a special case of the formula for 
multiplying two series. The constant 0 is the additive identity in $K[[z]]$, and the constant 
1 is the multiplicative identity. Similarly, for each $m \in \mathbb{Z}_{\ge 0}$, $z^m$ is the formal power series 
$(0, 0, \ldots, 1, 0, \ldots)$, where the 1 is in the position indexed by $m$. We have $z^m F = \sum_{n=0}^{\infty} d_n z^n$, 
where $d_n = 0$ for $0 \le n < m$, and $d_n = a_{n-m}$ for $n \ge m$. One may verify that $K[[z]]$ is 
an integral domain (a commutative ring with $1 \ne 0$ having no zero divisors). $K[[z]]$ is also 
a vector space over $K$ and a $K$-algebra. (See the Appendix for the definitions of these 
algebraic structures.) 
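
The coefficientwise definitions above translate directly into a short computation. The following Python sketch is only an illustration under stated assumptions: a series is truncated to its first few coefficients and stored as a list of `Fraction` values, and the helper names `ps_add`, `ps_mul`, and `ps_shift` are hypothetical, not notation from the text.

```python
from fractions import Fraction

def ps_add(f, g):
    """Coefficientwise sum of two truncated series (lists of equal length)."""
    return [a + b for a, b in zip(f, g)]

def ps_mul(f, g):
    """Cauchy product c_n = sum_{k=0}^{n} a_k * b_{n-k}, up to the shared truncation length."""
    n_terms = min(len(f), len(g))
    return [sum(f[k] * g[n - k] for k in range(n + 1)) for n in range(n_terms)]

def ps_shift(f, m):
    """Multiply by z^m: d_n = 0 for n < m and d_n = a_{n-m} for n >= m (same truncation length)."""
    return ([Fraction(0)] * m + list(f))[:len(f)]

N = 6                                                        # assumed truncation length
F = [Fraction(1)] * N                                        # 1 + z + z^2 + ...
G = [Fraction(1), Fraction(-1)] + [Fraction(0)] * (N - 2)    # 1 - z
product = ps_mul(F, G)                                       # first N coefficients of F * G
```

Each entry of `product` is obtained from finitely many field operations, in line with the remark that every coefficient of $F + G$ or $FG$ needs only finitely many operations in $K$.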

We define formal polynomials to be formal power series $F = \sum_{n=0}^{\infty} a_n z^n$ such that all 
but finitely many coefficients $a_n$ are zero. Let $K[z]$ denote the set of all formal polynomials 
with coefficients in $K$. It can be checked that $K[z]$ is a subring, subspace, and subalgebra 
of $K[[z]]$. 


11.1. Definition: Degree and Order. Given a nonzero $F \in K[z]$, the degree of $F$ (denoted 
$\deg(F)$) is the largest $n \in \mathbb{Z}_{\ge 0}$ such that $F|_{z^n} \ne 0$. Given a nonzero $G \in K[[z]]$, the 
order of $G$ (denoted $\operatorname{ord}(G)$) is the smallest $n \in \mathbb{Z}_{\ge 0}$ such that $G|_{z^n} \ne 0$. Note $\deg(0)$ and 
$\operatorname{ord}(0)$ are not defined, and $\deg(G)$ is not defined if $G$ is not a polynomial. 


One readily proves the following properties of degree and order. 


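11.2. Theorem: Degree and Order. Let $P, Q \in K[z]$ and $F, G \in K[[z]]$ be nonzero. 
(a) $\deg(PQ) = \deg(P) + \deg(Q)$. If $P + Q \ne 0$, then $\deg(P + Q) \le \max(\deg(P), \deg(Q))$. 
(b) $\operatorname{ord}(FG) = \operatorname{ord}(F) + \operatorname{ord}(G)$. If $F + G \ne 0$, then $\operatorname{ord}(F + G) \ge \min(\operatorname{ord}(F), \operatorname{ord}(G))$. 
(c) For all $m \in \mathbb{Z}_{\ge 0}$, $\deg(P^m) = m\deg(P)$ and $\operatorname{ord}(F^m) = m\operatorname{ord}(F)$. 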


Now we define limit concepts for formal power series. 


11.3. Definition: Limit of a Sequence of Formal Power Series. For each $m \in \mathbb{Z}_{\ge 0}$, 
let $F_m \in K[[z]]$ be a formal power series. We say that $F \in K[[z]]$ is the limit of the 
sequence $(F_m)$, denoted $\lim_{m\to\infty} F_m = F$ or $(F_m) \to F$, iff for each $k \in \mathbb{Z}_{\ge 0}$, there exists 
$M$ (depending on $k$) such that for all $m \ge M$, $F_m|_{z^k} = F|_{z^k}$. 


Intuitively, $(F_m) \to F$ means that as $m$ increases, the coefficient of any fixed $z^k$ in 
$F_m$ eventually stabilizes at the value $F|_{z^k}$. For example, take $F_m = z^m$ for each $m$. Then 
$\lim_{m\to\infty} z^m = 0$. To prove this, fix $k \in \mathbb{Z}_{\ge 0}$ and choose $M = k + 1$. For all $m \ge M$, the 
coefficient of $z^k$ in $z^m$ is 0, which equals the coefficient of $z^k$ in the zero series. (Remember 
that $z$ is still a formal indeterminate here. Compare this result to the analytic theorem that 
for all $z \in \mathbb{C}$ with $|z| < 1$, $\lim_{m\to\infty} z^m = 0$.) More generally, for any formal series $G$ with 
$\operatorname{ord}(G) > 0$, one may check that $(G^m) \to 0$. 
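
The stabilization of coefficients can be observed numerically. The sketch below is an illustration only: it assumes series are truncated to `N` coefficients and stored as Python lists, and the helper names `ps_mul`, `order`, and `powers` are hypothetical. It takes $G = z + z^2$ (so $\operatorname{ord}(G) = 1$) and checks that the coefficient of each fixed $z^k$ in $G^m$ is 0 once $m > k$.

```python
def ps_mul(f, g):
    """Cauchy product of two truncated series of the same length."""
    return [sum(f[k] * g[n - k] for k in range(n + 1)) for n in range(len(f))]

def order(f):
    """Index of the first nonzero stored coefficient, or None if all stored coefficients vanish."""
    for n, a in enumerate(f):
        if a != 0:
            return n
    return None

def powers(g, count):
    """Return the list [g^0, g^1, ..., g^(count-1)] of truncated powers."""
    one = [1] + [0] * (len(g) - 1)
    result = [one]
    for _ in range(count - 1):
        result.append(ps_mul(result[-1], g))
    return result

N = 8                                   # assumed truncation length
G = [0, 1, 1] + [0] * (N - 3)           # G = z + z^2
G_powers = powers(G, N + 2)
# For each fixed k, the coefficient of z^k in G^m vanishes for every m > k.
stabilized = all(G_powers[m][k] == 0 for k in range(N) for m in range(k + 1, N + 2))
```

Here `stabilized` records whether every tested coefficient has reached its limiting value 0, which is the content of $(G^m) \to 0$ restricted to the stored coefficients.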

In calculus, a function $f : \mathbb{R} \to \mathbb{R}$ is continuous iff whenever $(x_k)$ is a real sequence 
converging to $x \in \mathbb{R}$, the sequence $(f(x_k))$ converges to $f(x)$. This suggests the following 
formal version of continuity. 


11.4. Definition: Formal Continuity. Suppose $p : D \to K[[z]]$ is a function with domain 
$D$. When $D \subseteq K[[z]]$, we say $p$ is continuous iff $(F_k) \to F$ implies $(p(F_k)) \to p(F)$ for all 
$F_k, F \in D$. When $D \subseteq K[[z]] \times K[[z]]$, continuity of $p$ means that for all $(F_k, G_k), (F, G) \in D$, 
if $(F_k) \to F$ and $(G_k) \to G$ then $p(F_k, G_k) \to p(F, G)$. 


The next theorem lists some properties of formal continuity. 


11.5. Theorem: Formal Continuity. 
(a) The addition operation on $K[[z]]$, sending $(F, G)$ to $F + G$, is continuous. 
(b) The multiplication operation on $K[[z]]$, sending $(F, G)$ to $F \cdot G$, is continuous. 
(c) For fixed $G \in K[[z]]$, the maps $A_G(F) = F + G$ and $M_G(F) = F \cdot G$ are continuous. 
(d) For each $n \ge 0$, the coefficient extraction operation (sending $F$ to $F|_{z^n}$) is continuous. 
(e) The composition of two continuous functions is continuous when defined. 

Proof. We sketch the proof of (b), leaving the other parts of the theorem as exercises. 
Suppose $F_k, G_k, F, G \in K[[z]]$, $(F_k) \to F$, and $(G_k) \to G$; we must show $(F_k \cdot G_k) \to F \cdot G$. 
Fix $n \ge 0$. Since $(F_k) \to F$, there exists $K_1$ so that for all $k \ge K_1$ and all $i \in \{0, 1, \ldots, n\}$, 
$F_k|_{z^i} = F|_{z^i}$ (Exercise 11-9). Since $(G_k) \to G$, there exists $K_2$ so that for all $k \ge K_2$ and all 
$i \in \{0, 1, \ldots, n\}$, $G_k|_{z^i} = G|_{z^i}$. It follows that for all $k \ge \max\{K_1, K_2\}$, 

$$(F_k G_k)|_{z^n} = \sum_{i=0}^{n} F_k|_{z^i}\, G_k|_{z^{n-i}} = \sum_{i=0}^{n} F|_{z^i}\, G|_{z^{n-i}} = (FG)|_{z^n}.$$

This proves that $(F_k G_k) \to FG$. □ 


Now let us consider infinite sums and products. We can define the sum or product of 
finitely many formal power series by recursion. For instance, $\prod_{k=0}^{0} F_k = F_0$ and 
$\prod_{k=0}^{m+1} F_k = \left(\prod_{k=0}^{m} F_k\right) \cdot F_{m+1}$. We use limits to define the sum or product of 
infinitely many formal power series. 


11.6. Definition: Infinite Sums and Products of Formal Series. For each $m \in \mathbb{Z}_{\ge 0}$, 
let $F_m \in K[[z]]$ be a formal power series. (a) We say that $\sum_{m=0}^{\infty} F_m$ converges to the 
sum $F \in K[[z]]$ iff $\lim_{M\to\infty} \sum_{m=0}^{M} F_m = F$. (b) We say that $\prod_{m=0}^{\infty} F_m$ converges to the 
product $F \in K[[z]]$ iff $\lim_{M\to\infty} \prod_{m=0}^{M} F_m = F$. If these limits do not exist, we say that the 
corresponding infinite sum or product diverges. 


For example, fix $G = (b_n : n \ge 0) = \sum_{n=0}^{\infty} b_n z^n \in K[[z]]$. Defining $F_m = b_m z^m$ for 
each $m \ge 0$, one readily checks that $\sum_{m=0}^{\infty} F_m = G$. This means that our original notation 
$\sum_{n=0}^{\infty} b_n z^n$ for $G$ agrees with the infinite summation process defined above. 

In contrast to the situation for real or complex series, there is an easily tested criterion 
for the convergence of an infinite sum or product of formal power series. This criterion uses 
the following limit notation: given a sequence $(k_n : n \ge 0)$ of integers, we write $(k_n) \to \infty$ 
or $\lim_{n\to\infty} k_n = \infty$ to mean that for all $M \in \mathbb{Z}_{\ge 0}$, there exists $N \in \mathbb{Z}_{\ge 0}$ such that for all 
$n \ge N$, $k_n > M$. 


11.7. Theorem: Convergence Criterion For Infinite Sums and Products in $K[[z]]$. 
For each $n \in \mathbb{Z}_{\ge 0}$, let $F_n \in K[[z]]$ be a nonzero formal series. 

(a) $\sum_{n=0}^{\infty} F_n$ converges iff $(\operatorname{ord}(F_n)) \to \infty$. 

(b) $\prod_{n=0}^{\infty} (1 + F_n)$ converges iff some $F_n = -1$ or $(\operatorname{ord}(F_n)) \to \infty$. 


In this theorem, we assume that every $F_n$ is nonzero. This can always be arranged by 
dropping summands equal to 0 in $\sum_{n=0}^{\infty} F_n$ or dropping factors equal to 1 in $\prod_{n=0}^{\infty} (1 + F_n)$; 
this adjustment does not affect convergence of the sum or product (Exercise 11-13). 

We prove the backward direction in (b), leaving the rest of the proof as an exercise. 
If some $F_n = -1$, then all partial products from some point on are zero, so $\prod_{n=0}^{\infty} (1 + F_n)$ 
converges to zero. Now assume $F_n \ne -1$ for all $n$, and $(\operatorname{ord}(F_n)) \to \infty$. For each $m \ge 0$, let 
$P_m = \prod_{n=0}^{m} (1 + F_n)$ be the $m$th partial product. Fix $k \in \mathbb{Z}_{\ge 0}$; we must show 
that the coefficient $P_m|_{z^k}$ eventually stabilizes as $m$ increases. Because $(\operatorname{ord}(F_n)) \to \infty$, 
there exists $N$ so that for all $n > N$, $\operatorname{ord}(F_n) > k$. We show by induction on $m$ that for 
all $m \ge N$, $P_m|_{z^k} = P_N|_{z^k}$. This certainly holds for $m = N$. For the induction step, fix 
$m \ge N$, assume $P_m|_{z^k} = P_N|_{z^k}$, and prove $P_{m+1}|_{z^k} = P_N|_{z^k}$. By the recursive definition of 
finite products, $P_{m+1} = P_m (1 + F_{m+1})$. Taking the coefficient of $z^k$, we get 

$$P_{m+1}|_{z^k} = \sum_{i=0}^{k} P_m|_{z^{k-i}}\, (1 + F_{m+1})|_{z^i}.$$

Because $m + 1 > N$, $\operatorname{ord}(F_{m+1}) > k$, which means $F_{m+1}|_{z^i} = 0$ for all $i$ between 0 and $k$. 
So $(1 + F_{m+1})|_{z^i}$ is 1 for $i = 0$ and is 0 for $0 < i \le k$. Putting these values into the sum 
above, we find that $P_{m+1}|_{z^k} = P_m|_{z^k} = P_N|_{z^k}$, as needed. 
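
The stabilization argument can be watched on a concrete product. The sketch below is an illustration under assumptions not in the text: it uses the factors $1 + z^n$ for $n \ge 1$ (so $F_n = z^n$ and $\operatorname{ord}(F_n) = n \to \infty$), truncates every series to `n_terms` coefficients, and uses the hypothetical helper name `partial_products`.

```python
def partial_products(num_factors, n_terms):
    """Return the truncated partial products P_1, ..., P_num_factors,
    where P_m = (1 + z)(1 + z^2)...(1 + z^m)."""
    P = [1] + [0] * (n_terms - 1)
    result = []
    for n in range(1, num_factors + 1):
        # Multiplying by (1 + z^n) adds a copy of P shifted by n positions.
        P = [P[j] + (P[j - n] if j >= n else 0) for j in range(n_terms)]
        result.append(P)
    return result

prods = partial_products(10, 8)
limit = prods[-1]
# The coefficient of z^k no longer changes once m >= k, because every later
# factor 1 + z^n has n > k and so cannot affect that coefficient.
stable = all(prods[m - 1][k] == limit[k]
             for k in range(8) for m in range(max(k, 1), 11))
```

The limiting coefficients count integer partitions into distinct parts, which is the kind of infinite product used informally in Chapter 5.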




11.8. Example: Formal Geometric Series. Suppose $G \in K[[z]]$ is any nonzero formal 
series with zero constant term, and let $\operatorname{ord}(G) = d > 0$. We claim $\sum_{m=0}^{\infty} G^m$ converges in 
$K[[z]]$. For, $\operatorname{ord}(G^m) = m\operatorname{ord}(G) = md$, and $\lim_{m\to\infty} md = \infty$ since $d > 0$. The significance 
of the power series $\sum_{m=0}^{\infty} G^m$ is revealed in §11.3. 


11.2 The Infinite Sum and Product Rules 


The Sum and Product Rules for Weighted Sets (discussed in §5.8) provide combinatorial 
interpretations for finite sums and products of formal power series. This section extends 
these rules to the case of infinite sums and products of formal series. These more general 
rules were already used informally in our discussion of generating functions for trees (§5.10) 
and integer partitions (§5.15). 


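11.9. The Infinite Sum Rule for Weighted Sets. Suppose $S_k$ is a nonempty weighted 
set for each $k \in \mathbb{Z}_{\ge 0}$, $S_k \cap S_j = \emptyset$ for all $j \ne k$, and $S = \bigcup_{k=0}^{\infty} S_k$. Assume that for all $k$ 
and all $u \in S_k$, $\mathrm{wt}_S(u) = \mathrm{wt}_{S_k}(u)$. For each $k$, let $\mathrm{minwt}(S_k) = \min\{\mathrm{wt}_{S_k}(u) : u \in S_k\}$. If 
$\lim_{k\to\infty} \mathrm{minwt}(S_k) = \infty$, then $\mathrm{GF}(S; z) = \sum_{k=0}^{\infty} \mathrm{GF}(S_k; z)$. 


Proof. We have $\mathrm{minwt}(S_k) = \operatorname{ord}(\mathrm{GF}(S_k; z))$ for each $k$, so the hypothesis of the theorem 
means that $(\operatorname{ord}(\mathrm{GF}(S_k; z))) \to \infty$. By Theorem 11.7(a), it follows that $\sum_{k=0}^{\infty} \mathrm{GF}(S_k; z)$ 
converges to some formal power series $H$. We need only show that for each $n \in \mathbb{Z}_{\ge 0}$, 
$\mathrm{GF}(S; z)|_{z^n} = H|_{z^n}$. Fix $n$, and then choose $K$ so that for all $k > K$, $\mathrm{minwt}(S_k) > n$. Now 
define $H^* = \sum_{k=0}^{K} \mathrm{GF}(S_k; z)$ and $S^* = \bigcup_{k=0}^{K} S_k$. Since all objects in $S_{K+1}, S_{K+2}, \ldots$ have 
weight exceeding $n$, we see that $\{u \in S : \mathrm{wt}(u) = n\} = \{u \in S^* : \mathrm{wt}(u) = n\}$. Using this 
fact and the Weighted Sum Rule for finitely many summands, we conclude 

$$\mathrm{GF}(S; z)|_{z^n} = \mathrm{GF}(S^*; z)|_{z^n} = H^*|_{z^n} = H|_{z^n}.$$

The final equality follows since $\mathrm{GF}(S_k; z)|_{z^n} = 0$ for all $k > K$. □ 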


11.10. The Infinite Product Rule for Weighted Sets. For each $k \in \mathbb{Z}_{\ge 1}$, let $S_k$ be a 
weighted set that contains a unique object $o_k$ of weight zero, which we call the default value 
for choice $k$. Assume $\lim_{k\to\infty} \mathrm{minwt}(S_k - \{o_k\}) = \infty$. Suppose $S$ is a weighted set such 
that every $u \in S$ can be constructed in exactly one way as follows. For each $k \ge 1$, choose 
$u_k \in S_k$, subject to the restriction that we must choose the default value $o_k$ for all but finitely 
many $k$'s. Then assemble the chosen objects in a prescribed manner. Assume that whenever 
$u$ is constructed from $(u_k : k \ge 1)$, the weight-additivity condition $\mathrm{wt}_S(u) = \sum_{k=1}^{\infty} \mathrm{wt}_{S_k}(u_k)$ 
holds. Then $\mathrm{GF}(S; z) = \prod_{k=1}^{\infty} \mathrm{GF}(S_k; z)$. 


Proof. The assumed condition on $\mathrm{minwt}(S_k - \{o_k\})$ means that $(\operatorname{ord}(\mathrm{GF}(S_k; z) - 1)) \to \infty$, 
so that $\prod_{k=1}^{\infty} \mathrm{GF}(S_k; z)$ converges to some formal series $H$ by Theorem 11.7(b). Now fix 
$n \in \mathbb{Z}_{\ge 0}$, and choose $K$ so that for all $k > K$, $\mathrm{minwt}(S_k - \{o_k\}) > n$. Define $H^* = 
\prod_{k=1}^{K} \mathrm{GF}(S_k; z)$, and let $S^*$ be the subset of $u \in S$ that can be built by choosing $u_1 \in 
S_1, \ldots, u_K \in S_K$ arbitrarily, but then choosing $u_k = o_k$ for all $k > K$. The weight-additivity 
condition implies that $\{u \in S : \mathrm{wt}(u) = n\} = \{u \in S^* : \mathrm{wt}(u) = n\}$. Using the Weighted 
Product Rule for finitely many factors, one may now check that 

$$\mathrm{GF}(S; z)|_{z^n} = \mathrm{GF}(S^*; z)|_{z^n} = H^*|_{z^n} = H|_{z^n}.$$ □ 


11.3 Multiplicative Inverses of Formal Power Series 


In a field $K$, every nonzero element $x$ has a multiplicative inverse, which is an element $y \in K$ 
satisfying $xy = 1_K = yx$. In commutative rings that are not fields, some elements have 
multiplicative inverses and others do not. One can ask for characterizations of the invertible 
elements in such a ring. For example, it can be shown that for the ring $\mathbb{Z}_n$ of integers 
modulo $n$, $k \in \mathbb{Z}_n$ is invertible iff $\gcd(k, n) = 1$. The next theorem gives characterizations 
of the invertible elements in the rings $K[z]$ and $K[[z]]$. Roughly speaking, the theorem says 
that almost all formal power series can be inverted, whereas almost all formal polynomials 
cannot be inverted. 


11.11. Theorem: Invertible Polynomials and Formal Power Series. (a) A polynomial 
$P \in K[z]$ is invertible in $K[z]$ iff $\deg(P) = 0$ (i.e., $P$ is a nonzero constant). (b) A 
formal power series $F \in K[[z]]$ is invertible in $K[[z]]$ iff $\operatorname{ord}(F) = 0$ (i.e., $F$ has nonzero 
constant term). 


Proof. (a) First assume $P$ is a nonzero constant $c \in K$. Since $K$ is a field, $c^{-1}$ exists in $K$, 
and $c^{-1}$ is also a multiplicative inverse of $c$ in $K[z]$. Conversely, assume $P \in K[z]$ has an 
inverse $Q \in K[z]$, so $PQ = 1$. We must have $P \ne 0 \ne Q$, since $1_K \ne 0_K$. Taking degrees, 
we find that $0 = \deg(1) = \deg(PQ) = \deg(P) + \deg(Q)$. Since $\deg(P)$ and $\deg(Q)$ are 
nonnegative integers, this forces $\deg(P) = 0 = \deg(Q)$. This means that $P$ (and $Q$) are 
nonzero constants. 

(b) First assume $F$ is invertible in $K[[z]]$ with inverse $G \in K[[z]]$. Taking orders in the 
equation $FG = 1$, we get $0 = \operatorname{ord}(1) = \operatorname{ord}(FG) = \operatorname{ord}(F) + \operatorname{ord}(G)$, forcing $\operatorname{ord}(F) = 0 = 
\operatorname{ord}(G)$. In turn, this means that $F$ (and $G$) have nonzero constant terms. 

Conversely, assume $F = \sum_{n=0}^{\infty} a_n z^n$ with $a_0 \ne 0$. We construct a multiplicative inverse 
$G = \sum_{n=0}^{\infty} u_n z^n$ for $F$ recursively. For any $G \in K[[z]]$, we have $FG = 1$ iff $(FG)|_{z^0} = 1$ and 
$(FG)|_{z^n} = 0$ for all $n > 0$. So $FG = 1$ iff $(u_n : n \ge 0)$ solves the infinite system of equations 

$$a_0 u_0 = 1; \qquad a_0 u_n + a_1 u_{n-1} + a_2 u_{n-2} + \cdots + a_n u_0 = 0 \ \text{ for all } n > 0. \tag{11.1}$$

Since $a_0 \ne 0$, $a_0^{-1}$ exists in the field $K$. So we can recursively define $u_0 = a_0^{-1}$ and (assuming 
$u_0, \ldots, u_{n-1}$ have already been found) $u_n = -a_0^{-1} \sum_{k=1}^{n} a_k u_{n-k} \in K$. It can be checked by 
induction that the sequence $(u_n : n \ge 0)$ does solve (11.1), so $FG = 1$ does hold for this 
choice of the $u_n$'s. □ 
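
The recursion in part (b) of the proof can be sketched in a few lines of Python. This is a minimal illustration, assuming coefficients are given as a finite list (all later coefficients are taken to be 0) and using exact `Fraction` arithmetic; the function name `ps_inverse` is hypothetical.

```python
from fractions import Fraction

def ps_inverse(a, n_terms):
    """Return u_0, ..., u_{n_terms-1}, the first coefficients of 1/F,
    where F has coefficients a[0], a[1], ... and a[0] != 0."""
    if a[0] == 0:
        raise ValueError("constant term must be nonzero")

    def coeff(k):
        # Coefficients beyond the stored list are treated as zero.
        return a[k] if k < len(a) else 0

    inv_a0 = Fraction(1) / Fraction(a[0])
    u = [inv_a0]                                   # u_0 = a_0^{-1}
    for n in range(1, n_terms):
        # u_n = -a_0^{-1} * (a_1 u_{n-1} + a_2 u_{n-2} + ... + a_n u_0)
        s = sum(coeff(k) * u[n - k] for k in range(1, n + 1))
        u.append(-inv_a0 * s)
    return u

fib_like = ps_inverse([1, -1, -1], 8)   # coefficients of 1/(1 - z - z^2)
```

Computing $u_n$ reads only $a_0, \ldots, a_n$ and the previously found $u_0, \ldots, u_{n-1}$, so each coefficient of the inverse is obtained in finitely many steps.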


The preceding proof provides a recursive algorithm for calculating the coefficients of $F^{-1}$ 
from the coefficients of $F$. Note that $u_n = F^{-1}|_{z^n}$ only depends on the values of $a_k = F|_{z^k}$ 
for $0 \le k \le n$. Next we develop a closed formula for the coefficients of $F^{-1}$ based on a 
formal version of the geometric series. 


11.12. Theorem: Formal Geometric Series. For all $G \in K[[z]]$ satisfying $G|_{z^0} = 0$, 

$$(1 - G)^{-1} = \sum_{m=0}^{\infty} G^m.$$

Proof. Fix $G \in K[[z]]$ with $G|_{z^0} = 0$. For each $m \ge 0$, the distributive law shows that 

$$(1 - G)(1 + G + \cdots + G^m) = (1 + G + G^2 + \cdots + G^m) - (G + G^2 + \cdots + G^m + G^{m+1}) = 1 - G^{m+1},$$

since all intermediate terms cancel. (One can prove this equation more rigorously by 
induction on $m$.) Let $H_M = \sum_{m=0}^{M} G^m$ and $H = \sum_{m=0}^{\infty} G^m$; Example 11.8 shows that this 
infinite series converges. By continuity of the map $p(F) = (1 - G)F$, we now compute 

$$(1 - G)H = (1 - G) \lim_{M\to\infty} H_M = \lim_{M\to\infty} [(1 - G)H_M] = \lim_{M\to\infty} [1 - G^{M+1}] = 1 - 0 = 1;$$

note $(G^{M+1}) \to 0$ as $M \to \infty$ since $\operatorname{ord}(G) > 0$ or $G = 0$. Thus $H$ is the multiplicative 
inverse of $1 - G$, as needed. □ 


We can use this theorem to invert any formal series $F = \sum_{n=0}^{\infty} a_n z^n$ with $a_0 \ne 0$. To 
do so, write $F = a_0(1 - G)$ where $G = \sum_{n=1}^{\infty} (-a_n/a_0) z^n$. So $1/F = a_0^{-1} \sum_{m=0}^{\infty} G^m$. Next, 
we can use Theorem 5.35 to give an explicit formula for each coefficient $G^m|_{z^n}$. Defining 
$b_n = G|_{z^n} = -a_n/a_0$ for each $n > 0$, the formula is 

$$G^m|_{z^n} = \sum_{\substack{(i_1, i_2, \ldots, i_m) \in \mathbb{Z}_{>0}^m: \\ i_1 + i_2 + \cdots + i_m = n}} b_{i_1} b_{i_2} \cdots b_{i_m}. \tag{11.2}$$

Then the coefficient of $z^n$ in $1/F$ is $a_0^{-1}$ times the sum of these expressions over all $m$. For 
fixed $n$, it suffices to let $m$ range from 0 to $n$. To summarize, we have proved the following. 


11.13. Theorem: Coefficients of Multiplicative Inverses. For all $F = \sum_{n=0}^{\infty} a_n z^n$ in 
$K[[z]]$ with $a_0 \ne 0$, 

$$F^{-1}|_{z^n} = \sum_{m=0}^{n} \frac{(-1)^m}{a_0^{m+1}} \sum_{\substack{(i_1, i_2, \ldots, i_m) \in \mathbb{Z}_{>0}^m: \\ i_1 + i_2 + \cdots + i_m = n}} a_{i_1} a_{i_2} \cdots a_{i_m}. \tag{11.3}$$
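
Formula (11.3) can be evaluated directly by listing compositions of $n$. The Python sketch below is only an illustration: it assumes $F$ is given by a finite coefficient list (later coefficients are 0), and the names `compositions` and `inverse_coefficient` are hypothetical. It is far slower than the recursion (11.1), since the number of compositions of $n$ grows exponentially.

```python
from fractions import Fraction
from math import prod

def compositions(n, m):
    """Yield all m-tuples of positive integers with sum n."""
    if m == 0:
        if n == 0:
            yield ()
        return
    # The first part can be at most n - (m - 1), leaving 1 for each later part.
    for first in range(1, n - m + 2):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest

def inverse_coefficient(a, n):
    """Coefficient of z^n in 1/F computed from formula (11.3)."""
    def coeff(k):
        return a[k] if k < len(a) else 0

    a0 = Fraction(a[0])
    total = Fraction(0)
    for m in range(n + 1):
        inner = sum(prod(coeff(i) for i in comp) for comp in compositions(n, m))
        total += Fraction((-1) ** m) / a0 ** (m + 1) * inner
    return total

values = [inverse_coefficient([1, -1, -1], n) for n in range(8)]
```

For $F = 1 - z - z^2$ the list `values` can be compared with the output of the recursive method, which gives an independent check of (11.3) on small cases.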


There is another formula for the multiplicative inverse based on symmetric functions. 
Let $\Lambda$ be the ring of abstract symmetric functions with coefficients in $K$ (see §9.28). For 
$n \in \mathbb{Z}_{>0}$, let $e_n$ and $h_n$ denote the elementary and complete symmetric functions of degree 
$n$, and let $e_0 = h_0 = 1$. Define $E(z) = \sum_{n=0}^{\infty} e_n z^n$ and $H(z) = \sum_{n=0}^{\infty} (-1)^n h_n z^n$, which are 
formal power series with coefficients in $\Lambda$. The key observation is that $E(z)H(z) = 1$. This 
holds because the quantities $a_n = e_n$ and $u_n = (-1)^n h_n$ solve the system of equations (11.1), 
by Theorem 9.81 (also compare to the finite version (9.9)). 

Suppose we are trying to invert a formal series of the form $F(z) = 1 + \sum_{n=1}^{\infty} a_n z^n$ where 
each $a_n$ is in $K$. Recall that $(e_n : n \in \mathbb{Z}_{>0})$ is algebraically independent over $K$. So there 
exists a unique evaluation homomorphism $\phi : \Lambda \to K$ such that $\phi(e_n) = a_n$ for all $n > 0$. 
It can be checked that this map induces a $K$-algebra homomorphism $\phi^* : \Lambda[[z]] \to K[[z]]$ 
given by 

$$\phi^*\left(\sum_{n=0}^{\infty} f_n z^n\right) = \sum_{n=0}^{\infty} \phi(f_n) z^n \quad \text{for all } f_n \in \Lambda. \tag{11.4}$$

Now define $b_n = \phi((-1)^n h_n)$ for each $n > 0$, and set $G(z) = 1 + \sum_{n=1}^{\infty} b_n z^n$. Applying $\phi^*$ 
to the equation $E(z)H(z) = 1$ in $\Lambda[[z]]$ produces the equation $F(z)G(z) = 1$ in $K[[z]]$, so 
$G = F^{-1}$. In summary, we can find the coefficients in the multiplicative inverse of $F$ if we 
can figure out how the homomorphism $\phi$ acts on the symmetric functions $h_n$. It is possible 
to express each $h_n$ as a specific linear combination of the symmetric functions $e_\mu$, which 
are products of $e_k$'s (see Exercise 11-40). Then each $b_n = (-1)^n \phi(h_n)$ is determined by 
the values $a_k = \phi(e_k)$ and the fact that $\phi$ is a $K$-algebra homomorphism. Some algebraic 
manipulation eventually reproduces Formula (11.3). 

For yet another approach to multiplicative inversion based on the formal exponential 
function, see Theorem 11.27(b) below. 


11.4 Partial Fraction Expansions 


This section proves the existence and uniqueness of partial fraction decompositions for ratios 
of complex polynomials. Suppose $f$ and $g$ are polynomials with complex coefficients such 
that $g$ has nonzero constant term. We saw in §11.3 that $g$ (viewed as a formal power series) 
has a multiplicative inverse, so we can write $f/g = \sum_{n=0}^{\infty} b_n z^n$ for some $b_n \in \mathbb{C}$. The partial 
fraction decomposition of $f/g$ leads to explicit formulas for the coefficients $b_n$. Our starting 
point is the Fundamental Theorem of Algebra, which we state here without proof. 


11.14. The Fundamental Theorem of Algebra. Let $p \in \mathbb{C}[z]$ be a monic polynomial 
of degree $n \ge 1$. There exist pairwise distinct complex numbers $r_1, \ldots, r_k$ (unique up to 
reordering) and unique positive integers $n_1, \ldots, n_k$ such that 

$$p = (z - r_1)^{n_1} (z - r_2)^{n_2} \cdots (z - r_k)^{n_k}.$$

The number $r_i$ is called a root of $p$ of multiplicity $n_i$. 


The following variant of the Fundamental Theorem is more convenient for partial fraction 
problems because it allows us to use the Negative Binomial Theorem (see §5.3). 


11.15. Theorem: Factorization of Polynomials in $\mathbb{C}[z]$. Let $p \in \mathbb{C}[z]$ be a polynomial 
of degree $n \ge 1$ with $p(0) = 1$. There exist pairwise distinct, nonzero complex numbers 
$r_1, \ldots, r_k$ and positive integers $n_1, \ldots, n_k$ such that 

$$p(z) = (1 - r_1 z)^{n_1} (1 - r_2 z)^{n_2} \cdots (1 - r_k z)^{n_k}.$$


Proof. Write $p = \sum_{i=0}^{n} p_i z^i$ with $p_0 = 1$, and consider the polynomial $q = z^n p(1/z) = 
\sum_{i=0}^{n} p_{n-i} z^i$. Intuitively, $q$ is obtained from $p$ by reversing the coefficient sequence. Since 
$p_0 = 1$, $q$ is a monic polynomial of degree $n$. Using the Fundamental Theorem of Algebra, 
we write 

$$z^n p(1/z) = q(z) = \prod_{i=1}^{k} (z - r_i)^{n_i},$$

where $\sum_{i=1}^{k} n_i = n$. Since the constant term of $q$ is nonzero, no $r_i$ is equal to zero. Reversing 
the coefficient sequence again, it follows that 

$$p(z) = z^n q(1/z) = z^n \prod_{i=1}^{k} ((1/z) - r_i)^{n_i} = \prod_{i=1}^{k} z^{n_i} \left(\frac{1 - r_i z}{z}\right)^{n_i} = \prod_{i=1}^{k} (1 - r_i z)^{n_i}.$$ □ 


The next step is to rewrite a general fraction $f/g$ as a sum of fractions whose denominators 
have the form $(1 - rz)^m$. Note that we can always arrange that $g$ has constant 
coefficient 1 (assuming $g(0) \ne 0$ initially) by multiplying the numerator and denominator 
of $f/g$ by $1/g(0)$. 


11.16. Theorem: Splitting a Denominator. Suppose $f, g \in \mathbb{C}[z]$ are polynomials such 
that $g(0) = 1$, and let $g$ have factorization $g(z) = \prod_{i=1}^{k} (1 - r_i z)^{n_i}$, where $r_1, \ldots, r_k \in \mathbb{C}$ are 
distinct and nonzero. There exist polynomials $p_0, p_1, \ldots, p_k$ with $\deg(p_i) < n_i$ or $p_i = 0$ for 
$i$ between 1 and $k$, such that 

$$\frac{f}{g} = p_0 + \sum_{i=1}^{k} \frac{p_i}{(1 - r_i z)^{n_i}}.$$

Proof. For $i$ between 1 and $k$, define a polynomial $h_i = g/(1 - r_i z)^{n_i} = \prod_{j \ne i} (1 - r_j z)^{n_j}$. 
Since $r_1, \ldots, r_k$ are distinct, $\gcd(h_1, \ldots, h_k) = 1$. By a known result from polynomial algebra, 
it follows that there exist polynomials $a_1, \ldots, a_k \in \mathbb{C}[z]$ with $a_1 h_1 + \cdots + a_k h_k = 1$. 
Therefore, 

$$\frac{f}{g} = \frac{f \cdot 1}{g} = \frac{f a_1 h_1 + \cdots + f a_k h_k}{g} = \sum_{i=1}^{k} \frac{f a_i}{(1 - r_i z)^{n_i}}.$$

This is almost the answer we seek, but the degrees of the numerators may be too high. 
Using polynomial division we can write $f a_i = q_i (1 - r_i z)^{n_i} + p_i$ where $q_i, p_i \in \mathbb{C}[z]$, and 
either $p_i = 0$ or $\deg(p_i) < n_i$. Dividing by $(1 - r_i z)^{n_i}$, we see that 

$$\frac{f}{g} = p_0 + \sum_{i=1}^{k} \frac{p_i}{(1 - r_i z)^{n_i}}$$

holds if we take $p_0 = \sum_{i=1}^{k} q_i \in \mathbb{C}[z]$. □ 


The fractions $p_i/(1 - r_i z)^{n_i}$ (with $\deg(p_i) < n_i$ or $p_i = 0$) can be further reduced into 
sums of fractions where the numerators are complex constants. 


11.17. Theorem: Division by $(1 - rz)^n$. Given a fraction $p/(1 - rz)^n$ where $p \in \mathbb{C}[z]$, 
$\deg(p) < n$ or $p = 0$, and $0 \ne r \in \mathbb{C}$, there exist complex numbers $a_1, \ldots, a_n$ such that 

$$\frac{p}{(1 - rz)^n} = \sum_{j=1}^{n} \frac{a_j}{(1 - rz)^j}.$$


Proof. This proof uses some facts about evaluation homomorphisms for polynomial rings 
from Exercise 11-42. Consider the evaluation homomorphism $E : \mathbb{C}[z] \to \mathbb{C}[z]$ such that 
$E(z) = 1 - rz$. The evaluation homomorphism $E^* : \mathbb{C}[z] \to \mathbb{C}[z]$ such that $E^*(z) = (1 - z)/r$ 
is a two-sided inverse to $E$ (since $\mathrm{id}_{\mathbb{C}[z]}$, $E \circ E^*$, and $E^* \circ E$ are all $\mathbb{C}$-algebra homomorphisms 
sending $z$ to $z$, forcing $\mathrm{id} = E \circ E^* = E^* \circ E$ by uniqueness), so $E$ is a bijection. In particular, 
$E$ is surjective, so $p = E(q)$ for some $q \in \mathbb{C}[z]$. Now, one may check that $E$ and $E^*$ each 
map polynomials of degree less than $n$ to polynomials of degree less than $n$, and it follows 
that $\deg(q) < n$ or $q = 0$. Write $q = c_0 + c_1 z + c_2 z^2 + \cdots + c_{n-1} z^{n-1}$, with $c_i \in \mathbb{C}$. Then 

$$p = E(q) = c_0 + c_1 (1 - rz) + c_2 (1 - rz)^2 + \cdots + c_{n-1} (1 - rz)^{n-1}.$$

Dividing by $(1 - rz)^n$, we see that we may take $a_1 = c_{n-1}, \ldots, a_{n-1} = c_1, a_n = c_0$. □ 


The next result summarizes the partial fraction manipulations in the last two theorems. 
The uniqueness proof given below also provides an algorithm for finding the coefficients in 
the partial fraction decomposition. 


11.18. Theorem: Partial Fraction Decompositions. Suppose $f, g \in \mathbb{C}[z]$ are polynomials 
with $g(0) = 1$; let $g = \prod_{i=1}^{k} (1 - r_i z)^{n_i}$ where the $r_i$ are distinct nonzero complex 
numbers. There exist a unique polynomial $h \in \mathbb{C}[z]$ and unique complex numbers $a_{ij}$ (for 
$i, j$ in the range $1 \le i \le k$ and $1 \le j \le n_i$) with 

$$\frac{f}{g} = h + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \frac{a_{ij}}{(1 - r_i z)^j}. \tag{11.5}$$

The coefficient of $z^n$ in the power series expansion of $f/g$ is 

$$(f/g)|_{z^n} = h|_{z^n} + \sum_{i=1}^{k} \sum_{j=1}^{n_i} a_{ij} \binom{n + j - 1}{j - 1} r_i^n.$$

If we view $z$ as a complex variable (rather than a formal indeterminate), the series for $f/g$ 
in (11.5) converges for all $z \in \mathbb{C}$ such that $|z| < \min\{1/|r_i| : 1 \le i \le k\}$. 


Proof. Existence of the decomposition (11.5) follows by combining Theorems 11.16 
and 11.17. The formula for the coefficient of $z^n$ follows from the Negative Binomial 
Theorem 5.13. We must still prove uniqueness of $h$ and the $a_{ij}$. Note first that the numbers $r_i$ 
and $n_i$ appearing in the factorization of $g$ are unique because of the uniqueness assertion 
in the Fundamental Theorem of Algebra. Now consider any expression of the form (11.5). 
Multiplying both sides by $g$ produces an equation 

$$f = gh + \sum_{i=1}^{k} \sum_{j=1}^{n_i} a_{ij} (1 - r_i z)^{n_i - j} \prod_{s \ne i} (1 - r_s z)^{n_s}, \tag{11.6}$$

where both sides are polynomials. Furthermore, the terms in the double sum add up to a 
polynomial that is either zero or has degree less than $\deg(g)$. Thus $h$ must be the quotient 
when $f$ is divided by $g$ using the polynomial division algorithm, and this quotient is known 
to be unique. Next, we show how to recover the top coefficients $a_{i,n_i}$ for $1 \le i \le k$. Fix $i$, 
and evaluate the polynomials on each side of (11.6) at $z = 1/r_i$. Since $g(1/r_i) = 0$ and any 
positive power of $(1 - r_i z)$ becomes zero for this choice of $z$, all but one term on the right 
side becomes zero. We are left with 

$$f(1/r_i) = a_{i,n_i} \prod_{s \ne i} (1 - r_s/r_i)^{n_s}.$$

Since $r_s \ne r_i$ for $s \ne i$, the product is nonzero. Thus there is a unique $a_{i,n_i} \in \mathbb{C}$ for which 
this equation holds. We can use the displayed formula to calculate each $a_{i,n_i}$ given $f$ and $g$. 

To find the remaining $a_{ij}$, subtract the recovered summands $a_{i,n_i}/(1 - r_i z)^{n_i}$ from both 
sides of (11.5) (thus replacing $f/g$ by a new fraction $f_1/g_1$) to obtain a new problem in which 
all $n_i$ have been reduced by 1. We now repeat the procedure of the previous paragraph to 
find $a_{i,n_i-1}$ for all $i$ such that $n_i > 1$. Continuing similarly, we eventually recover all the 
$a_{ij}$. This process is illustrated in the examples below. □ 


11.19. Example. Let us find the partial fraction expansion of 

$$\frac{f}{g} = \frac{z^2 - 2}{1 - 2z - z^2 + 2z^3}.$$

To find the required factorization of the denominator, we first reverse the coefficient sequence 
to obtain $z^3 - 2z^2 - z + 2$. This polynomial factors as $(z - 2)(z - 1)(z + 1)$, so the original 
denominator can be rewritten as 

$$1 - 2z - z^2 + 2z^3 = (1 - 2z)(1 - z)(1 + z)$$

(see the proof of Theorem 11.15). We know that 

$$\frac{z^2 - 2}{1 - 2z - z^2 + 2z^3} = \frac{A}{1 - 2z} + \frac{B}{1 - z} + \frac{C}{1 + z} \tag{11.7}$$

for certain complex constants $A, B, C$. To find $A$, multiply both sides by $1 - 2z$ to get 

$$\frac{z^2 - 2}{(1 - z)(1 + z)} = A + \frac{B(1 - 2z)}{1 - z} + \frac{C(1 - 2z)}{1 + z}.$$

Now set $z = 1/2$ to see that $A = (-7/4)/(3/4) = -7/3$. Similarly, 

$$B = \left.\frac{z^2 - 2}{(1 - 2z)(1 + z)}\right|_{z=1} = \frac{1}{2}, \qquad C = \left.\frac{z^2 - 2}{(1 - 2z)(1 - z)}\right|_{z=-1} = -\frac{1}{6}.$$

Expanding (11.7) into a power series, we obtain 

$$\frac{z^2 - 2}{1 - 2z - z^2 + 2z^3} = \sum_{n=0}^{\infty} \left[ -\frac{7}{3} \cdot 2^n + \frac{1}{2} - \frac{1}{6} (-1)^n \right] z^n.$$
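
The closed form in Example 11.19 can be checked against a direct series division. The Python sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists in increasing powers of $z$, the helper name `series_quotient` is hypothetical, and the quotient is found by solving $g \cdot q = f$ one coefficient at a time as in (11.1).

```python
from fractions import Fraction

def series_quotient(f, g, n_terms):
    """First n_terms coefficients of f/g as a formal power series (requires g[0] != 0)."""
    def coeff(p, k):
        return Fraction(p[k]) if k < len(p) else Fraction(0)

    q = []
    for n in range(n_terms):
        # Coefficient of z^n in g*q must equal the coefficient of z^n in f.
        s = sum(coeff(g, k) * q[n - k] for k in range(1, n + 1))
        q.append((coeff(f, n) - s) / coeff(g, 0))
    return q

f = [-2, 0, 1]           # z^2 - 2
g = [1, -2, -1, 2]       # 1 - 2z - z^2 + 2z^3

def closed_form(n):
    """Coefficient of z^n predicted by the partial fraction expansion (11.7)."""
    return Fraction(-7, 3) * 2 ** n + Fraction(1, 2) - Fraction(1, 6) * (-1) ** n

direct = series_quotient(f, g, 10)
predicted = [closed_form(n) for n in range(10)]
```

Agreement of `direct` and `predicted` on the first several coefficients is a quick sanity check on the values of $A$, $B$, and $C$; it is not a substitute for the uniqueness proof.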


11.20. Example. We find the partial fraction expansion of 

$$\frac{f}{g} = \frac{1}{1 - 9z + 30z^2 - 46z^3 + 33z^4 - 9z^5}.$$

Factoring the denominator as in the last example, we find that $g(z) = (1 - z)^3 (1 - 3z)^2$. 
We can therefore write 

$$\frac{f}{g} = \frac{A}{(1 - z)^3} + \frac{B}{(1 - z)^2} + \frac{C}{1 - z} + \frac{D}{(1 - 3z)^2} + \frac{E}{1 - 3z}. \tag{11.8}$$

To find $A$, multiply both sides by $(1 - z)^3$ and then substitute $z = 1$ to get $A = 1/(-2)^2 = 
1/4$. Similarly, multiplying by $(1 - 3z)^2$ and setting $z = 1/3$ reveals that $D = 1/(2/3)^3 = 
27/8$. Having found $A$ and $D$, we subtract $A/(1 - z)^3$ and $D/(1 - 3z)^2$ from both sides of 
(11.8). After simplifying, we are left with 

$$\frac{(3/8)(3z - 7)}{(1 - z)^2 (1 - 3z)} = \frac{B}{(1 - z)^2} + \frac{C}{1 - z} + \frac{E}{1 - 3z}.$$

Now we repeat the process. Multiplying by $(1 - z)^2$ and setting $z = 1$ shows that $B = 3/4$. 
Similarly, $E = -81/16$. Subtracting these terms from both sides leaves $(27/16)/(1 - z)$, so 
$C = 27/16$. Using (11.8) and the Negative Binomial Theorem, we conclude that 

$$\frac{f}{g} = \sum_{n=0}^{\infty} \left[ \frac{1}{4} \binom{n + 2}{2} + \frac{3}{4} \binom{n + 1}{1} + \frac{27}{16} + \frac{27}{8} \binom{n + 1}{1} 3^n - \frac{81}{16} 3^n \right] z^n.$$
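
The coefficient formula of Theorem 11.18 is easy to evaluate once the constants $a_{ij}$ are known. The Python sketch below is illustrative only: it assumes each summand $a/(1 - rz)^j$ is stored as a tuple `(a, r, j)`, takes $h = 0$, and uses the hypothetical name `partial_fraction_coefficient`. It then checks Example 11.20 by multiplying the resulting series by $g$ and comparing with $f = 1$.

```python
from fractions import Fraction
from math import comb

def partial_fraction_coefficient(terms, n):
    """Coefficient of z^n in the sum of a/(1 - r z)^j over (a, r, j) in terms,
    using a * C(n + j - 1, j - 1) * r^n for each summand."""
    return sum(Fraction(a) * comb(n + j - 1, j - 1) * Fraction(r) ** n
               for a, r, j in terms)

# Constants A, B, C, D, E from (11.8), stored as (a, r, j).
terms = [
    (Fraction(1, 4), 1, 3),
    (Fraction(3, 4), 1, 2),
    (Fraction(27, 16), 1, 1),
    (Fraction(27, 8), 3, 2),
    (Fraction(-81, 16), 3, 1),
]
g = [1, -9, 30, -46, 33, -9]         # 1 - 9z + 30z^2 - 46z^3 + 33z^4 - 9z^5

N = 12                               # number of coefficients to test
c = [partial_fraction_coefficient(terms, n) for n in range(N)]
# Coefficients of g * (sum of c_n z^n), which should reproduce f = 1.
check = [sum(g[k] * c[n - k] for k in range(min(n, len(g) - 1) + 1)) for n in range(N)]
```

Multiplying back by $g$ avoids computing $1/g$ a second time, so the check is independent of the method used to find the constants.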


11.5 Generating Functions for Recursively Defined Sequences 


In §5.5, we gave examples of the generating function method for solving recursions. This 
section proves a general theorem about the generating function for a recursively defined 
sequence. 


11.21. Theorem: Recursions with Constant Coefficients. Suppose we are given the 
following data: a positive integer $k$, complex constants $c_1, c_2, \ldots, c_k, d_0, \ldots, d_{k-1}$, and a 
function $g : \mathbb{Z}_{\ge k} \to \mathbb{C}$. The recursion 

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} + g(n) \quad \text{for all } n \ge k \tag{11.9}$$

with initial conditions $a_i = d_i$ for $0 \le i < k$ has a unique solution. Defining $F = \sum_{n=0}^{\infty} a_n z^n$, 
$d'_i = d_i - c_1 d_{i-1} - c_2 d_{i-2} - \cdots - c_i d_0$, $G = \sum_{i=0}^{k-1} d'_i z^i + \sum_{n=k}^{\infty} g(n) z^n$, and 
$P = 1 - c_1 z - c_2 z^2 - \cdots - c_k z^k$, we have $F = G/P$. 

Proof. The existence and uniqueness of the sequence $(a_n : n \ge 0)$ satisfying the given 
recursion and initial conditions is intuitively plausible and can be informally established by 
an induction argument. (A formal proof requires the Recursion Theorem from set theory; 
see Section 12 of [59] for a discussion of this theorem.) It follows that the formal power 
series $F$ in the theorem statement is well-defined. Consider next the formal power series 

$$H = (1 - c_1 z - c_2 z^2 - \cdots - c_k z^k) F = PF.$$

For each $n \ge k$, the recursion shows that $G$ and $H$ have the same coefficient of $z^n$, namely 

$$g(n) = a_n - c_1 a_{n-1} - c_2 a_{n-2} - \cdots - c_k a_{n-k}.$$

On the other hand, for $0 \le n < k$, the initial conditions show that the coefficient of $z^n$ in 
both $G$ and $H$ is $d'_n$. So $G = H$, and the formula for $F$ follows by dividing the equation 
$G = PF$ by the formal power series $P$, which has a multiplicative inverse since $P(0) \ne 0$. □ 
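
The identity $F = G/P$ can be tested on a small inhomogeneous recursion. The Python sketch below is an illustration under assumptions not in the text: the example data ($c = (1, 1)$, $d = (0, 1)$, $g(n) = n$) and the function names are hypothetical, and all series are truncated to `n_terms` coefficients.

```python
from fractions import Fraction

def solve_by_recursion(c, d, g, n_terms):
    """Iterate a_n = c_1 a_{n-1} + ... + c_k a_{n-k} + g(n) from the initial values d."""
    k = len(c)
    a = list(d)
    for n in range(k, n_terms):
        a.append(sum(c[j - 1] * a[n - j] for j in range(1, k + 1)) + g(n))
    return a[:n_terms]

def solve_by_generating_function(c, d, g, n_terms):
    """Compute the coefficients of F = G/P as in Theorem 11.21."""
    k = len(c)
    # d'_i = d_i - c_1 d_{i-1} - ... - c_i d_0 for 0 <= i < k.
    G = [d[i] - sum(c[j - 1] * d[i - j] for j in range(1, i + 1)) for i in range(k)]
    G += [g(n) for n in range(k, n_terms)]
    P = [1] + [-cj for cj in c]          # P = 1 - c_1 z - ... - c_k z^k
    F = []
    for n in range(n_terms):
        # Solve P * F = G coefficient by coefficient; P[0] = 1, so no division is needed.
        s = sum(P[j] * F[n - j] for j in range(1, min(n, k) + 1))
        F.append(Fraction(G[n]) - s)
    return F

c, d = [1, 1], [0, 1]
direct = solve_by_recursion(c, d, lambda n: n, 10)
via_gf = solve_by_generating_function(c, d, lambda n: n, 10)
```

Both functions return the same list because dividing $G$ by $P$ coefficientwise simply replays the recursion; the value of the theorem is that $G/P$ can then be analyzed by partial fractions when $G$ is a rational function.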


We can now justify Method 2.70 for solving a homogeneous recursion with constant
coefficients. In this case, the function $g(n)$ in (11.9) is identically zero. So the generating
function $F(z) = \sum_{n=0}^{\infty} a_n z^n$ for the solution to the recursion has the form


$$F(z) = \frac{G(z)}{1 - c_1 z - c_2 z^2 - \cdots - c_k z^k},$$


where $G(z)$ is a polynomial of degree less than $k$. Recall that the characteristic polynomial of
the given recursion is $\chi(z) = z^k - c_1 z^{k-1} - c_2 z^{k-2} - \cdots - c_k$. Factoring $\chi(z)$ as
$\prod_{i=1}^{k} (z - r_i)^{m_i}$, we know from the proof of Theorem 11.15 that the denominator $P$ in the
formula for $F(z)$ factors as $\prod_{i=1}^{k} (1 - r_i z)^{m_i}$. By Theorem 11.18 on Partial Fraction
Decompositions, we conclude that


$$F(z) = \sum_{i=1}^{k} \sum_{j=1}^{m_i} \frac{b_{i,j}}{(1 - r_i z)^j}$$


for some complex constants $b_{i,j}$. This means that the sequence $(a_n : n \ge 0)$ solving the
recursion is some linear combination of the basic sequences $(r_i^n (n+s){\downarrow}_s / s! : n \ge 0)$, where
$1 \le i \le k$ and $0 \le s < m_i$.

Conversely, since $(1 - r_i z)^{m_i}$ is a factor of $P$, one may check that each of these basic
sequences does solve the recursion in (11.9) for an appropriate choice of initial conditions.
To obtain the basic solutions mentioned in 2.70, note that $(n+s){\downarrow}_s / s!$ is a polynomial in $n$ of
degree $s$. The collection of all such polynomials for $0 \le s < m_i$ is thus a basis for the vector
space of polynomials in $n$ of degree less than $m_i$. We know that $(n^s : 0 \le s < m_i)$ is also a
basis for this vector space. Therefore, each sequence of the form $(r_i^n (n+s){\downarrow}_s / s! : n \ge 0)$ can
be expressed as a linear combination of sequences of the form $(r_i^n n^t : n \ge 0)$ and vice versa,
where $0 \le s, t < m_i$. In conclusion, when $g = 0$, every solution to the recursion in (11.9) is a
linear combination of the basic solutions $(r_i^n n^t : n \ge 0)$, where $r_i$ is a root of the recursion's
characteristic polynomial of multiplicity $m_i$ and $0 \le t < m_i$.




11.6 Formal Composition and Derivative Rules 


In analysis, the composition of two functions $f : X \to Y$ and $g : Y \to Z$ is a new function
$g \circ f : X \to Z$ given by $(g \circ f)(x) = g(f(x))$ for all $x \in X$. This section studies a formal
version of composition involving formal power series.


11.22. Definition: Composition of Formal Series. Suppose $F, G \in K[[z]]$ are formal
power series such that $G$ has constant term zero. Writing $F = \sum_{n=0}^{\infty} a_n z^n$, we define the
composition $F \circ G$ [also denoted $F(G)$ or $F(G(z))$] by $F \circ G = \sum_{n=0}^{\infty} a_n G^n$.


Since $G$ has constant term zero, the infinite sum of formal power series $\sum_{n=0}^{\infty} a_n G^n$
does converge, because the order of the nonzero summands $a_n G^n$ tends to infinity (see
Theorem 11.7). Intuitively, we obtain $F(G(z))$ from $F(z)$ by substituting the series $G(z)$
for the formal variable $z$. We point out once again that $F$ is not a function of $z$, so this
operation is not a special case of ordinary function composition, although the same symbol
$\circ$ is used for both operations.


11.23. Example. Let $F = \sum_{n=0}^{\infty} a_n z^n$. Taking $G$ to be $z = 0 + 1z + 0z^2 + \cdots$, the definition
of composition gives $F \circ G = F$. So our earlier notational convention $F(z) = F$ is consistent
with the current notation $F(G)$ for formal composition. Similarly, taking $G$ to be the zero
series, we get $F(0) = F \circ 0 = a_0 = F|_{z^0}$. Thus, $F(0)$ denotes the constant coefficient of $F$.
If $F(0)$ is zero, we see at once that $z \circ F = F = F \circ z$. Thus, the formal power series $z$ acts
as a two-sided identity element for the operation of formal composition (applied to formal
series with constant term zero).


Suppose $F = \sum_{n=0}^{\infty} a_n z^n$ and $G = \sum_{n=0}^{\infty} b_n z^n$ are formal power series with $G(0) = 0$.
Given $m \in \mathbb{Z}_{\ge 0}$, let $F^* = \sum_{n=0}^{m} a_n z^n$ and $G^* = \sum_{n=0}^{m} b_n z^n$. One readily checks that
$(F \circ G)|_{z^m} = (F^* \circ G^*)|_{z^m}$, since the discarded powers of $z$ cannot contribute to the coefficient
of $z^m$ in the composition. This shows that any particular coefficient of $F \circ G$ can be computed
from a finite amount of data with finitely many algebraic operations.
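The finite computation described above can be sketched in a few lines. This is only an illustration under stated assumptions: series are represented as coefficient lists truncated after $z^m$, coefficients are exact rationals, and the helper names `mul` and `compose` are hypothetical.

```python
from fractions import Fraction

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def compose(F, G, m):
    """Coefficients of z^0..z^m in F o G = sum_n a_n G^n, assuming G(0) = 0."""
    if G and G[0] != 0:
        raise ValueError("G must have constant term zero")
    result = [Fraction(0)] * (m + 1)
    power = [Fraction(1)] + [Fraction(0)] * m      # G^0 = 1
    for a_n in F[: m + 1]:                         # G^n has order >= n, so n <= m suffices
        result = [r + a_n * p for r, p in zip(result, power)]
        power = mul(power, G, m)                   # next power of G, truncated
    return result
```

As a usage example, composing the geometric series $1 + z + z^2 + \cdots$ with $G = z + z^2$ gives the first coefficients of $1/(1 - z - z^2)$.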

The next theorem lists some algebraic properties of formal composition. 


11.24. Theorem: Formal Composition. Let $F, G, H, F_k, G_k \in K[[z]]$ (for $k \in \mathbb{Z}_{>0}$)
satisfy $G(0) = G_k(0) = 0$, and let $c \in K$.
(a) Continuity of Composition: If $(F_k) \to F$ and $(G_k) \to G$, then $(F_k \circ G_k) \to F \circ G$.
(b) Homomorphism Properties: $(F + H) \circ G = (F \circ G) + (H \circ G)$,
$(F \cdot H) \circ G = (F \circ G) \cdot (H \circ G)$, and $c \circ G = c$.
(c) Associativity: If $H(0) = 0$, then $(F \circ G) \circ H = F \circ (G \circ H)$.


Proof. To prove (a), assume $(F_k) \to F$ and $(G_k) \to G$. Fix $m \in \mathbb{Z}_{\ge 0}$, and choose $k_0$ large
enough so that for all $k \ge k_0$ and all $n \in \{0, 1, \ldots, m\}$, $F_k|_{z^n} = F|_{z^n}$ and $G_k|_{z^n} = G|_{z^n}$. Let
$F_k^*$, $F^*$, $G_k^*$, $G^*$ be the series obtained from $F_k$, $F$, $G_k$, and $G$ (respectively) by discarding
all powers of $z$ beyond $z^m$. Our choice of $k_0$ shows that $F_k^* = F^*$ and $G_k^* = G^*$ for all
$k \ge k_0$. By the remarks preceding the theorem statement, we see that


$$(F_k \circ G_k)|_{z^m} = (F_k^* \circ G_k^*)|_{z^m} = (F^* \circ G^*)|_{z^m} = (F \circ G)|_{z^m} \quad \text{for all } k \ge k_0.$$


Since $m$ was arbitrary, it follows that $(F_k \circ G_k) \to F \circ G$, as needed.

We ask the reader to prove (b) as an exercise. Parts (a) and (b) can be restated as follows.
For fixed $G$ with $G(0) = 0$, define a map $R_G : K[[z]] \to K[[z]]$ (called right composition
by $G$ or evaluation at $G$) by $R_G(F) = F \circ G$ for $F \in K[[z]]$. Part (b) says that $R_G$ is
a $K$-algebra homomorphism. Part (a) says, among other things, that this homomorphism is
continuous: whenever $(F_k) \to F$ in $K[[z]]$, $R_G(F_k) \to R_G(F)$. Note that $R_G(z) = z \circ G = G$.
In fact, one may check that $R_G$ is the unique continuous $K$-algebra homomorphism of $K[[z]]$
sending $z$ to $G$ (Exercise 11-48).

We use these remarks to prove (c). Assuming $G(0) = H(0) = 0$, we also have $(G \circ H)(0) = 0$,
so that $F \circ (G \circ H)$ is defined. Now $F \circ (G \circ H) = R_{G \circ H}(F)$, whereas $(F \circ G) \circ H =
R_H(R_G(F)) = (R_H \circ R_G)(F)$. Here $R_H \circ R_G$ denotes ordinary function composition of
the functions $R_H$ and $R_G$. It is routine to check that the composition of two continuous
$K$-algebra homomorphisms on $K[[z]]$ is also a continuous homomorphism. So, on one hand,
$R_H \circ R_G$ is a continuous homomorphism sending $z$ to $(R_H \circ R_G)(z) = (z \circ G) \circ H = G \circ H$.
On the other hand, $R_{G \circ H}$ is also a continuous homomorphism sending $z$ to $G \circ H$. By the
uniqueness assertion in the last paragraph, the functions $R_H \circ R_G$ and $R_{G \circ H}$ must be equal.
So applying these functions to $F$ must give equal results, which proves (c). □


Recall that the formal derivative of a formal power series $F = \sum_{n=0}^{\infty} a_n z^n$ is the formal
power series $F' = \frac{d}{dz} F(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} = \sum_{m=0}^{\infty} (m+1) a_{m+1} z^m$. The next theorem
summarizes some rules for computing formal derivatives, including a formal version of the
Chain Rule.


11.25. Theorem: Formal Derivatives. Let $F, G \in K[[z]]$ be formal power series.
(a) The Sum Rule: $(F + G)' = F' + G'$.
(b) The Scalar Rule: For all $c \in K$, $(cF)' = c(F')$.
(c) The Product Rule: $(F \cdot G)' = (F') \cdot G + F \cdot (G')$.
(d) The Power Rule: For all $n \in \mathbb{Z}_{>0}$, $(F^n)' = n F^{n-1} \cdot F'$.
(e) The Chain Rule: If $G(0) = 0$, then $(F \circ G)' = (F' \circ G) \cdot G'$. In other words,
$$\frac{d}{dz}[F(G(z))] = F'(G(z)) \cdot G'(z).$$
(f) Continuity of Differentiation: If $(F_k) \to F$, then $(F_k') \to F'$.

Proof. We sketch the proofs of (a) through (e). Write $F = \sum_{n=0}^{\infty} a_n z^n$ and $G = \sum_{n=0}^{\infty} b_n z^n$.
We verify (a), (b), and (c) by comparing the coefficients of $z^n$ on both sides of each rule.
For (a), we get $(n+1)(a_{n+1} + b_{n+1})$; for (b), we get $c(n+1)a_{n+1}$; and for (c), the coefficient
is


$$(n+1) \sum_{k=0}^{n+1} a_k b_{n+1-k} = \sum_{k=0}^{n+1} (k + (n+1-k)) a_k b_{n+1-k}$$
$$= \sum_{k=1}^{n+1} k a_k b_{n+1-k} + \sum_{k=0}^{n} (n+1-k) a_k b_{n+1-k}$$
$$= \sum_{j=0}^{n} (j+1) a_{j+1} b_{n-j} + \sum_{k=0}^{n} a_k (n-k+1) b_{n-k+1}.$$


Now (d) follows from (c) by induction on $n$.

For (e), fix $G$ with $G(0) = 0$, and define functions $p, q : K[[z]] \to K[[z]]$ by $p(F) = (F \circ G)'$
and $q(F) = (F' \circ G) \cdot G'$ for $F \in K[[z]]$. Using earlier results on composition, products, and
derivatives, one readily verifies that both $p$ and $q$ are $K$-linear and continuous. Furthermore,
for any $n \in \mathbb{Z}_{\ge 0}$, we compute


$$p(z^n) = (z^n \circ G)' = (G^n)' = n G^{n-1} \cdot G' \quad \text{and} \quad q(z^n) = (n z^{n-1} \circ G) \cdot G' = n G^{n-1} \cdot G'.$$


Since $p(z^n) = q(z^n)$ for all $n$, it follows from Exercise 11-51 that $p = q$ as functions, so that
$p(F) = q(F)$ for all $F \in K[[z]]$. This proves (e). □
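The Formal Chain Rule can be spot-checked on truncated series. The sketch below is an illustration only: the helpers `mul`, `compose`, `deriv`, and `chain_rule_sides` are hypothetical names, series are coefficient lists, and composition is evaluated by Horner's scheme, which is a swapped-in technique rather than the definition used in the text.

```python
from fractions import Fraction

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def compose(F, G, m):
    """F o G truncated after z^m (Horner's scheme), assuming G(0) = 0."""
    R = [Fraction(0)] * (m + 1)
    for a in reversed(F[: m + 1]):
        R = mul(R, G, m)
        R[0] += a
    return R

def deriv(F):
    """Formal derivative: the coefficient of z^(n-1) is n * a_n."""
    return [n * F[n] for n in range(1, len(F))]

def chain_rule_sides(F, G, m):
    """Return ((F o G)', (F' o G) * G'), both truncated after z^(m-1); needs m >= 1."""
    lhs = deriv(compose(F, G, m))
    rhs = mul(compose(deriv(F), G, m - 1), deriv(G), m - 1)
    return lhs, rhs
```

For $F = z^2$ and $G = z + z^2$, both sides come out as $2z + 6z^2 + 4z^3$.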




11.7 Formal Exponentials and Logarithms 


The exponential and logarithm functions play a central role in calculus. This section
introduces formal versions of these functions that satisfy many of the same properties as their
analytic counterparts.


11.26. Definition: Formal Exponentials. Let $\exp(z) = e^z = \sum_{n=0}^{\infty} z^n/n!$. For any formal
power series $G \in K[[z]]$ with constant term zero, define $\exp(G) = e^{G} = \sum_{n=0}^{\infty} G^n/n!$,
which is the composition of $e^z$ with $G(z)$.


The next theorem lists some fundamental properties of the formal exponential function. 


11.27. Theorem: Formal Exponentials. Let $F$, $G$, and $G_k$ (for $k \in \mathbb{Z}_{>0}$) be formal
power series with constant term zero.

(a) $\frac{d}{dz} \exp(z) = \exp(z)$, $\exp(0) = 1$, and $\frac{d}{dz} \exp(F(z)) = \exp(F(z)) F'(z)$.

(b) $\exp(F) \ne 0$, and $\exp(-F) = 1/\exp(F)$.

(c) $\exp(F + G) = \exp(F) \exp(G)$.

(d) exp is continuous: if $(G_k) \to G$, then $(\exp(G_k)) \to \exp(G)$.

(e) For all $N \in \mathbb{Z}_{>0}$, $\exp\left(\sum_{k=1}^{N} G_k\right) = \prod_{k=1}^{N} \exp(G_k)$.

(f) If $\sum_{k=1}^{\infty} G_k$ converges, then $\prod_{k=1}^{\infty} \exp(G_k)$ converges to $\exp\left(\sum_{k=1}^{\infty} G_k\right)$.

Proof. (a) By definition of formal derivatives,
$$\frac{d}{dz} \exp(z) = \sum_{n=1}^{\infty} \frac{n z^{n-1}}{n!} = \sum_{n=1}^{\infty} \frac{z^{n-1}}{(n-1)!} = \sum_{m=0}^{\infty} \frac{z^m}{m!} = \exp(z).$$
Next, $\exp(0) = 1 + \sum_{n=1}^{\infty} 0^n/n! = 1$. The formula for the derivative of
$\exp(F(z))$ now follows from the Formal Chain Rule.

(b) Consider the formal series $H(z) = \exp(F(z)) \exp(-F(z))$. By the Formal Product
Rule, the Formal Chain Rule, and (a), we compute


$$H'(z) = \exp(F(z)) [\exp(-F(z)) \cdot (-F'(z))] + [\exp(F(z)) F'(z)] \exp(-F(z)) = 0.$$


So $H(z)$ must be a constant formal power series. Since $H(0) = \exp(F(0)) \exp(-F(0)) =
\exp(0) \exp(-0) = 1 \cdot 1 = 1$, $H$ is the constant 1. This means that $1 = \exp(F) \exp(-F)$, so
that $\exp(-F)$ is the multiplicative inverse of $\exp(F)$ in $K[[z]]$. We also see that $\exp(F)$ cannot
be zero, since $1 \ne 0$ in $K$. In fact, one readily checks that $\exp(F)$ has constant term 1.

(c) Consider the formal series $P = \exp(F + G) \exp(-F) \exp(-G)$. The formal derivative
of $P$ is


$$P' = (F' + G') e^{F+G} e^{-F} e^{-G} - F' e^{F+G} e^{-F} e^{-G} - G' e^{F+G} e^{-F} e^{-G} = 0,$$


so $P$ is constant. Evaluating at $z = 0$, we find that the constant is $P(0) = e^{0+0} e^{-0} e^{-0} = 1$,
so $P = 1$. Using (b), it follows that $\exp(F + G) = P \exp(F) \exp(G) = \exp(F) \exp(G)$, as
needed.

Part (d) follows from the continuity of formal composition. Part (e) follows from (c) by
induction on $N$. Part (f) follows from (d) and (e) by a formal limiting argument. We ask
the reader to give the details of these proofs in an exercise. □
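Parts (b) and (c) of the theorem are easy to test on truncated series. The following sketch is illustrative only: `mul` and `exp_series` are hypothetical helper names, coefficients are exact rationals, and the exponential is computed directly from the defining sum $\sum_n G^n/n!$.

```python
from fractions import Fraction

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def exp_series(G, m):
    """exp(G) truncated after z^m, assuming G has constant term zero."""
    if G and G[0] != 0:
        raise ValueError("G must have constant term zero")
    total = [Fraction(0)] * (m + 1)
    term = [Fraction(1)] + [Fraction(0)] * m       # holds G^n / n!, starting with n = 0
    for n in range(m + 1):                         # G^n has order >= n, so n <= m suffices
        total = [t + u for t, u in zip(total, term)]
        term = [x / (n + 1) for x in mul(term, G, m)]
    return total
```

For example, `exp_series([0, 1], 4)` returns the coefficients $1, 1, 1/2, 1/6, 1/24$ of $e^z$, and multiplying `exp_series(F, m)` by `exp_series(G, m)` with `mul` agrees with the exponential of the coefficientwise sum.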


Next we define a formal version of the natural logarithm function. 


11.28. Definition: Formal Logarithms. Define a formal power series


$$L(z) = \log(1 + z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} z^n}{n} = z - z^2/2 + z^3/3 - z^4/4 + \cdots.$$


For any formal power series $G \in K[[z]]$ with constant term 0, define $\log(1 + G) =
\sum_{n=1}^{\infty} (-1)^{n-1} G^n/n$, which is the composition $L \circ G$. For any formal power series $H$ with
constant term 1, define $\log(H) = \sum_{n=1}^{\infty} (-1)^{n-1} (H - 1)^n/n$, which is the composition
$L \circ (H - 1)$.


11.29. Theorem: Formal Logarithms. Let $F$, $G$, $H$, and $H_k$ (for $k \in \mathbb{Z}_{>0}$) be formal
power series with $F(0) = 0$ and $G(0) = H(0) = H_k(0) = 1$.

(a) $\frac{d}{dz} \log(1 + z) = (1 + z)^{-1}$, $\log(1) = 0$, $\frac{d}{dz} \log(1 + F(z)) = F'(z)/(1 + F(z))$, and
$\frac{d}{dz} \log(G(z)) = G'(z)/G(z)$.

(b) $\log(\exp(F)) = F$ and $\exp(\log(H)) = H$.

(c) $\log(GH) = \log(G) + \log(H)$.

(d) log is continuous: if $(H_k) \to H$, then $(\log(H_k)) \to \log(H)$.

(e) For all $N \in \mathbb{Z}_{>0}$, $\log\left(\prod_{k=1}^{N} H_k\right) = \sum_{k=1}^{N} \log(H_k)$.

(f) If $\prod_{k=1}^{\infty} H_k$ converges, then $\sum_{k=1}^{\infty} \log(H_k)$ converges to $\log\left(\prod_{k=1}^{\infty} H_k\right)$.


Proof. (a) By the definition of formal derivatives and the Formal Geometric Series formula,


$$\frac{d}{dz} \log(1 + z) = \sum_{n=1}^{\infty} (-1)^{n-1} z^{n-1} = \sum_{m=0}^{\infty} (-z)^m = (1 + z)^{-1}.$$


Taking $G = 0$ in the definition gives $\log(1) = \log(1 + 0) = \sum_{n=1}^{\infty} (-1)^{n-1} 0^n/n = 0$. The
formula for the derivative of $\log(1 + F(z))$ follows from the Formal Chain Rule, and the
formula for $\frac{d}{dz} \log(G(z))$ follows by taking $F$ to be $G - 1$.

(b) Note that $\exp(F)$ has constant term 1, so $\log(\exp(F))$ is defined. Let
$Q(z) = \log(\exp(F(z))) - F(z)$. Taking the formal derivative of $Q$, we get $Q'(z) =
F'(z) \exp(F(z))/\exp(F(z)) - F'(z) = 0$, so $Q$ is a constant. Since the constant term of
$Q$ is zero, $Q$ is the zero power series. Thus, $\log(\exp(F)) = F$ as needed. A similar proof
shows that $\exp(\log(H)) = H$.

(c) Consider the formal series $P = \log(GH) - \log(G) - \log(H)$. Computing derivatives,


$$P' = \frac{G'(z) H(z) + G(z) H'(z)}{G(z) H(z)} - \frac{G'(z)}{G(z)} - \frac{H'(z)}{H(z)} = 0.$$


So $P$ is constant. As $P$ has constant term zero, $P = 0$, proving (c).

Now (d) follows from the continuity of formal composition, (e) follows from (c) by
induction on $N$, and (f) follows from (d) and (e) by a formal limiting argument (Exercise 11-62). □
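A truncated version of the formal logarithm makes part (c) easy to test. This is a minimal sketch under stated assumptions: `mul` and `log_series` are hypothetical helper names, series are coefficient lists of exact rationals, and the logarithm is computed straight from the defining sum $\sum_n (-1)^{n-1}(H-1)^n/n$.

```python
from fractions import Fraction

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def log_series(H, m):
    """log(H) truncated after z^m, assuming H has constant term 1."""
    if not H or H[0] != 1:
        raise ValueError("H must have constant term one")
    U = [Fraction(0)] + [Fraction(x) for x in H[1 : m + 1]]   # U = H - 1
    total = [Fraction(0)] * (m + 1)
    power = [Fraction(1)] + [Fraction(0)] * m                 # U^0
    for n in range(1, m + 1):                                 # U^n has order >= n
        power = mul(power, U, m)                              # U^n
        sign = 1 if n % 2 == 1 else -1
        total = [t + sign * p / n for t, p in zip(total, power)]
    return total
```

For instance, applying `log_series` to the geometric series $1 + z + z^2 + \cdots$ returns the coefficients $0, 1, 1/2, 1/3, \ldots$ of $-\log(1 - z)$.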




11.8 The Exponential Formula 


Many combinatorial structures can be decomposed into disjoint unions of smaller structures 
that are connected in some sense. For example, set partitions consist of a collection of 
disjoint blocks; permutations can be regarded as a collection of disjoint cycles; and graphs 
are disjoint unions of connected graphs. The Exponential Formula allows us to compute the 
generating function for such structures from the generating functions for their connected 
components. This formula reveals the combinatorial significance of the exponential of a 
formal power series. 

First we need to review set partitions and ordered set partitions. Recall from §2.12 that
a set partition of a set $X$ is a set $P = \{B_1, B_2, \ldots, B_m\}$ of nonempty subsets of $X$ such
that every $x \in X$ belongs to exactly one block $B_i$ of $P$. Since $P$ is a set, the blocks in a set
partition can be presented in any order; for example, $\{\{1,2,4\},\{3,5\}\}$ and $\{\{3,5\},\{1,2,4\}\}$
are equal set partitions of the set $\{1,2,3,4,5\}$. In contrast, an ordered set partition of $X$ is a
sequence $Q = (B_1, B_2, \ldots, B_m)$ of distinct sets such that $\{B_1, B_2, \ldots, B_m\}$ is a set partition
of $X$. Here the order of the blocks in the list $Q$ is important, but we can still list the elements
within each block in any order. For example, $(\{1,2,4\},\{3,5\})$ and $(\{3,5\},\{1,2,4\})$ are two
different ordered set partitions of $\{1,2,3,4,5\}$.

Let $\mathrm{SetPar}(n,m)$ be the set of all set partitions of $\{1,2,\ldots,n\}$ consisting of $m$ blocks,
and let $\mathrm{OrdPar}(n,m)$ be the set of all ordered set partitions of $\{1,2,\ldots,n\}$ consisting
of $m$ blocks. We know that $|\mathrm{SetPar}(n,m)| = S(n,m)$, the Stirling number of the second
kind. Define a map $f : \mathrm{OrdPar}(n,m) \to \mathrm{SetPar}(n,m)$ by sending $Q = (B_1, \ldots, B_m) \in
\mathrm{OrdPar}(n,m)$ to $f(Q) = \{B_1, \ldots, B_m\}$. This map forgets the ordering of the blocks in an
ordered set partition to produce an ordinary set partition. We see that $f$ is onto but not
one-to-one. More precisely, since we can order a set of $m$ distinct blocks in $m!$ ways, we
see that for each $P \in \mathrm{SetPar}(n,m)$, there are exactly $m!$ objects $Q \in \mathrm{OrdPar}(n,m)$ with
$f(Q) = P$. Consequently, $|\mathrm{OrdPar}(n,m)| = m!\,S(n,m)$.

We also need the following encoding of ordered set partitions. Let $W(n,m)$ be the set
of all lists $(k_1, \ldots, k_m, w)$, where $k_1, \ldots, k_m$ are positive integers with $k_1 + \cdots + k_m = n$,
and $w$ is an anagram in $\mathcal{R}(1^{k_1} \cdots m^{k_m})$ (so for $j$ between 1 and $m$, the letter $j$ appears
$k_j$ times in $w$). We define a bijection $g : W(n,m) \to \mathrm{OrdPar}(n,m)$ as follows. Given
$(k_1, \ldots, k_m, w) \in W(n,m)$, $g$ maps this object to the ordered set partition $(B_1, \ldots, B_m)$
such that $B_j = \{i : w_i = j\}$ for $1 \le j \le m$. Note that $|B_j| = k_j$ for $1 \le j \le m$. The inverse
of $g$ sends $(B_1, \ldots, B_m) \in \mathrm{OrdPar}(n,m)$ to the list $(|B_1|, \ldots, |B_m|, w)$, where $w_i = j$ iff
$i \in B_j$ for $1 \le i \le n$ and $1 \le j \le m$.

We are now ready to state the Exponential Formula. Compare the next theorem to the 
EGF Product Rule in §5.12. 


11.30. The Exponential Formula. Let $C$ and $S$ be sets such that each object $x$ in $C$
or $S$ has a weight $\mathrm{wt}(x)$ and a size $\mathrm{sz}(x)$, where $\mathrm{sz}(c) > 0$ for all $c \in C$. (Intuitively, $C$ is a
set of connected structures of various positive sizes and $S$ is a set of objects that are labeled
disjoint unions of these structures.) Suppose that every object $s \in S$ of size $n \ge 0$ can be
constructed uniquely by the following process. First, choose a set partition $P$ of $\{1,2,\ldots,n\}$.
Next, for each block $B$ in the set partition $P$, choose a connected structure $c(B)$ from $C$ such
that $\mathrm{sz}(c(B)) = |B|$. Finally, assemble these choices in a prescribed manner to produce $s$.
Assume that whenever $s$ is built from $P$ and $(c(B) : B \in P)$, the weight-additivity condition
$\mathrm{wt}(s) = \sum_{B \in P} \mathrm{wt}(c(B))$ holds. Define generating functions


$$G_C = \sum_{c \in C} t^{\mathrm{wt}(c)} \frac{z^{\mathrm{sz}(c)}}{\mathrm{sz}(c)!} \quad \text{and} \quad G_S = \sum_{s \in S} t^{\mathrm{wt}(s)} \frac{z^{\mathrm{sz}(s)}}{\mathrm{sz}(s)!}.$$


Then $G_S = \exp(G_C)$.


Proof. For each $n \ge 0$, define $S_n = \{s \in S : \mathrm{sz}(s) = n\}$, $C_n = \{c \in C : \mathrm{sz}(c) = n\}$,
$\mathrm{GF}(S_n) = \sum_{s \in S_n} t^{\mathrm{wt}(s)}$, and $\mathrm{GF}(C_n) = \sum_{c \in C_n} t^{\mathrm{wt}(c)}$. We show that $G_S|_{z^n} = [\exp(G_C)]|_{z^n}$
by computing both sides. The constant term on each side is 1, so fix $n > 0$ from now on.
First, $G_S|_{z^n} = \mathrm{GF}(S_n)/n!$. To compute this more explicitly, we use the description of
objects in $S_n$ in the theorem statement. For each $P \in \mathrm{SetPar}(n,m)$, let $S_P$ be the set of
$s \in S_n$ that are built by choosing $P$ at the first stage. By the Sum Rule for Weighted Sets,


$$\mathrm{GF}(S_n) = \sum_{m=1}^{n} \sum_{P \in \mathrm{SetPar}(n,m)} \mathrm{GF}(S_P).$$


Now fix $m \in \{1,2,\ldots,n\}$, and fix a set partition $P \in \mathrm{SetPar}(n,m)$. Write $P =
\{B_1, B_2, \ldots, B_m\}$, where we choose the indexing so that $\min(B_1) < \min(B_2) < \cdots <
\min(B_m)$, and let $k_i = |B_i|$ for $1 \le i \le m$. By assumption, we can build each object
$s \in S_P$ by choosing $c(B_1) \in C_{k_1}$, $c(B_2) \in C_{k_2}$, ..., $c(B_m) \in C_{k_m}$, and assembling
these choices in a prescribed manner. By the Product Rule for Weighted Sets,
$\mathrm{GF}(S_P) = \mathrm{GF}(C_{k_1}) \mathrm{GF}(C_{k_2}) \cdots \mathrm{GF}(C_{k_m}) = \prod_{B \in P} \mathrm{GF}(C_{|B|})$. Thus,


$$G_S|_{z^n} = \frac{1}{n!} \mathrm{GF}(S_n) = \frac{1}{n!} \sum_{m=1}^{n} \sum_{P \in \mathrm{SetPar}(n,m)} \prod_{B \in P} \mathrm{GF}(C_{|B|}). \tag{11.10}$$


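Next, we find the coefficient of $z^n$ in $\exp(G_C)$. By Definition 11.26,


$$\exp(G_C) = \sum_{m=0}^{\infty} \frac{G_C^m}{m!} = \sum_{m=0}^{\infty} \frac{1}{m!} \left( \sum_{k=1}^{\infty} \mathrm{GF}(C_k) \frac{z^k}{k!} \right)^m.$$


Using Theorem 5.35, we find that


$$[\exp(G_C)]|_{z^n} = \sum_{m=1}^{n} \frac{1}{m!} \sum_{\substack{(k_1,k_2,\ldots,k_m) \in \mathbb{Z}_{>0}^m: \\ k_1+k_2+\cdots+k_m=n}} \frac{\mathrm{GF}(C_{k_1}) \mathrm{GF}(C_{k_2}) \cdots \mathrm{GF}(C_{k_m})}{k_1!\, k_2! \cdots k_m!}$$


(compare to (11.2)). Now multiply and divide each term by $n!$ to make a multinomial
coefficient appear. We find that $[\exp(G_C)]|_{z^n}$ equals


$$\frac{1}{n!} \sum_{m=1}^{n} \frac{1}{m!} \sum_{\substack{(k_1,k_2,\ldots,k_m) \in \mathbb{Z}_{>0}^m: \\ k_1+k_2+\cdots+k_m=n}} \binom{n}{k_1, k_2, \ldots, k_m} \mathrm{GF}(C_{k_1}) \mathrm{GF}(C_{k_2}) \cdots \mathrm{GF}(C_{k_m}).$$

By the Anagram Rule,


$$\binom{n}{k_1, k_2, \ldots, k_m} = \sum_{w \in \mathcal{R}(1^{k_1} 2^{k_2} \cdots m^{k_m})} 1.$$


This observation turns the inner sum into a sum indexed by objects $(k_1, \ldots, k_m, w)$ in
$W(n,m)$:

$$[\exp(G_C)]|_{z^n} = \frac{1}{n!} \sum_{m=1}^{n} \frac{1}{m!} \sum_{(k_1,\ldots,k_m,w) \in W(n,m)} \mathrm{GF}(C_{k_1}) \mathrm{GF}(C_{k_2}) \cdots \mathrm{GF}(C_{k_m}).$$


We use the bijection $g$ to convert to a sum indexed by ordered set partitions. Recall that
for $g(k_1, \ldots, k_m, w) = (B_1, \ldots, B_m)$, we have $|B_j| = k_j$ for all $j$. So we get


$$[\exp(G_C)]|_{z^n} = \frac{1}{n!} \sum_{m=1}^{n} \frac{1}{m!} \sum_{(B_1,\ldots,B_m) \in \mathrm{OrdPar}(n,m)} \prod_{j=1}^{m} \mathrm{GF}(C_{|B_j|}).$$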

The final step is to use the function $f$ to change the sum over ordered set partitions to a
sum over ordinary set partitions. Note that multiplication of generating functions is
commutative. So for each of the $m!$ ordered set partitions $(B_1, \ldots, B_m)$ that map to a given set
partition $P \in \mathrm{SetPar}(n,m)$, we have $\prod_{j=1}^{m} \mathrm{GF}(C_{|B_j|}) = \prod_{B \in P} \mathrm{GF}(C_{|B|})$. This produces an
extra factor of $m!$ that cancels the division by $m!$ in the previous formula. In conclusion,


$$[\exp(G_C)]|_{z^n} = \frac{1}{n!} \sum_{m=1}^{n} \sum_{P \in \mathrm{SetPar}(n,m)} \prod_{B \in P} \mathrm{GF}(C_{|B|}).$$


This agrees with (11.10), so the proof is complete. □


11.9 Examples of the Exponential Formula


This section gives examples illustrating the Exponential Formula. We begin by showing how 
the generating functions for Stirling numbers (derived in §5.13 and §5.14) are consequences 
of this formula. 


11.31. Example: Stirling Numbers of the Second Kind. Let us find the generating
function for the collection $S$ of all set partitions of one of the sets $\{1,2,\ldots,n\}$ for some
$n \ge 0$. For a set partition $P \in \mathrm{SetPar}(n,m) \subseteq S$, we define $\mathrm{sz}(P) = n$ and $\mathrm{wt}(P) = m$,
the number of blocks of $P$. In this case, the connected pieces used to build a set partition
$P$ are the individual blocks of $P$. Each connected piece has no additional structure beyond
the set of labels appearing in the block. We model this by letting $C = \mathbb{Z}_{>0} = \{1,2,3,\ldots\}$,
and setting $\mathrm{sz}(c) = c$ and $\mathrm{wt}(c) = 1$ for all $c \in C$. To build a typical object in $S$ of size
$n$, we first choose a set partition $P = \{B_1, \ldots, B_m\}$ of $\{1,2,\ldots,n\}$. For $1 \le i \le m$, we
then choose $c(B_i) = |B_i|$, which is the unique element in $C$ of size $|B_i|$. The final object
constructed from these choices is $P$ itself. The weight-additivity condition holds because
$\mathrm{wt}(P) = m = \sum_{i=1}^{m} 1 = \sum_{i=1}^{m} \mathrm{wt}(c(B_i))$. By the definition of $C$,


$$G_C = \sum_{c \in C} t^{\mathrm{wt}(c)} \frac{z^{\mathrm{sz}(c)}}{\mathrm{sz}(c)!} = \sum_{k=1}^{\infty} t \frac{z^k}{k!} = t(e^z - 1).$$


Applying the Exponential Formula, we get


$$1 + \sum_{n=1}^{\infty} \sum_{m=1}^{n} S(n,m) t^m \frac{z^n}{n!} = G_S = \exp(G_C) = e^{t(e^z - 1)},$$
in agreement with Theorem 5.43. Recall that the Bell number $B(n)$ is the number of set
partitions of $\{1,2,\ldots,n\}$ with any number of blocks. Setting $t = 1$ in the previous generating
function, we get the EGF for Bell numbers:


$$\sum_{n=0}^{\infty} B(n) \frac{z^n}{n!} = e^{e^z - 1}.$$

This calculation can be adjusted to count set partitions with restrictions on the allowable
block sizes. For example, suppose we are counting set partitions that contain no blocks
of size 1 or 3. We modify $G_C$ by making the coefficients of $z^1$ and $z^3$ zero, giving $G_C =
t(e^z - 1 - z - z^3/3!)$. Then $G_S = \exp(G_C) = \exp(t(e^z - 1 - z - z^3/3!))$. Setting $t = 1$ and
extracting the coefficient of $z^{12}/12!$, we find (using a computer algebra system) that the number
of such set partitions of $\{1,2,\ldots,12\}$ is 159,457.
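The coefficient extraction in this example does not need a full computer algebra system. The sketch below is an illustration with hypothetical helper names (`mul`, `exp_series`, `count_set_partitions`): it sets $t = 1$, truncates $G_C$ after $z^{n}$, exponentiates with exact rational arithmetic, and multiplies the coefficient of $z^n$ by $n!$.

```python
from fractions import Fraction
from math import factorial

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def exp_series(G, m):
    """exp(G) truncated after z^m, assuming G has constant term zero."""
    total = [Fraction(0)] * (m + 1)
    term = [Fraction(1)] + [Fraction(0)] * m       # holds G^n / n!
    for n in range(m + 1):
        total = [t + u for t, u in zip(total, term)]
        term = [x / (n + 1) for x in mul(term, G, m)]
    return total

def count_set_partitions(n_max, allowed):
    """Counts of set partitions of {1..n}, n = 0..n_max, whose block sizes k all satisfy allowed(k)."""
    # G_C at t = 1: the coefficient of z^k is 1/k! for each allowed block size k >= 1.
    G_C = [Fraction(0)] + [
        Fraction(1, factorial(k)) if allowed(k) else Fraction(0)
        for k in range(1, n_max + 1)
    ]
    G_S = exp_series(G_C, n_max)
    return [int(G_S[n] * factorial(n)) for n in range(n_max + 1)]
```

With `allowed` always true this produces the Bell numbers, and with `allowed = lambda k: k not in (1, 3)` the entry for $n = 12$ reproduces the count quoted above.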


11.32. Example: Stirling Numbers of the First Kind. Let S be the set of all permu- 
tations of one of the sets {1,2,...,n} for some n > 0. For w € S permuting {1,2,...,n}, 
let sz(w) =n, and let wt(w) be the number of cycles in the digraph of w (see §3.6). Recall 
that the signless Stirling number of the first kind, denoted s’(n,k), is the number of objects 
in S of size n and weight k. We use the Exponential Formula to find the generating function 
for these numbers. 

In this case, we assemble permutations $w \in S$ from connected pieces that are the
individual directed cycles in the digraph of $w$. To model this, let $C$ be the set of all $k$-cycles on
$\{1,2,\ldots,k\}$, as $k$ ranges through positive integers. Given a $k$-cycle $c \in C$, define $\mathrm{wt}(c) = 1$
and $\mathrm{sz}(c) = k$. For each $k > 0$, $C$ contains $(k-1)!$ objects of size $k$. Therefore


$$G_C = \sum_{c \in C} t^{\mathrm{wt}(c)} \frac{z^{\mathrm{sz}(c)}}{\mathrm{sz}(c)!} = \sum_{k=1}^{\infty} t\,(k-1)!\,\frac{z^k}{k!} = t \sum_{k=1}^{\infty} \frac{z^k}{k} = -t \log(1 - z) = \log((1 - z)^{-t}),$$


where the last step uses Exercise 11-64(e). To check the hypothesis of the Exponential
Formula, note that each permutation $s \in S$ of size $n$ can be built uniquely as follows. Choose
a set partition $P = \{B_1, \ldots, B_m\}$ of $\{1,2,\ldots,n\}$. For each $k$-element block $B$ of $P$, choose
a $k$-cycle $c(B) \in C$. Suppose $B = \{i_1 < i_2 < \cdots < i_k\}$. Replace the numbers $1, 2, \ldots, k$
appearing in the $k$-cycle $c(B)$ by the numbers $i_1, i_2, \ldots, i_k$, respectively. We obtain one of the
cycles in the digraph of $s$. After all choices have been made, we have built the full digraph of
$s$, which uniquely determines $s$. For example, suppose $n = 8$, $P = \{\{2,4,7,8\}, \{1,3,5\}, \{6\}\}$,
$c(\{2,4,7,8\}) = (1,3,4,2)$, $c(\{1,3,5\}) = (1,3,2)$, and $c(\{6\}) = (1)$. These choices create the
permutation $s = (2,7,8,4)(1,5,3)(6)$. The Exponential Formula tells us that


$$1 + \sum_{n=1}^{\infty} \sum_{m=1}^{n} s'(n,m) t^m \frac{z^n}{n!} = G_S = \exp(G_C) = \exp(\log((1 - z)^{-t})) = (1 - z)^{-t},$$


in agreement with (5.11).

To get a generating function for derangements (permutations with no 1-cycles), we
modify $G_C$ by making the coefficient of $z^1$ be zero. This gives $G_C = \log[(1 - z)^{-t}] - tz$,
so $G_S = \exp(G_C) = e^{-tz}(1 - z)^{-t}$. (Compare to the generating function found in §5.7.)
Extracting the coefficient of $t^5 z^{12}/12!$, we find that there are 866,250 derangements of
$\{1,2,\ldots,12\}$ consisting of five cycles.
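The extraction of the coefficient of $t^m z^n/n!$ can be sketched as follows. The function name `perms_by_cycles` and its `min_len` parameter are hypothetical; the sketch relies on the observation that, writing $G_C = tA(z)$ with $A(z) = \sum_k z^k/k$ over the allowed cycle lengths $k$, the coefficient of $t^m$ in $\exp(tA)$ is $A^m/m!$.

```python
from fractions import Fraction
from math import factorial

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def perms_by_cycles(n, m, min_len=1):
    """Number of permutations of {1..n} with exactly m cycles, each of length >= min_len."""
    # A(z): each allowed cycle length k contributes (k-1)! z^k / k! = z^k / k.
    A = [Fraction(0)] * (n + 1)
    for k in range(max(1, min_len), n + 1):
        A[k] = Fraction(1, k)
    power = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(m):
        power = mul(power, A, n)                   # A^m, truncated after z^n
    # Coefficient of t^m z^n / n! in exp(t*A) is n! * [z^n] A^m / m!.
    return int(power[n] * factorial(n) / factorial(m))
```

With `min_len=1` this gives the signless Stirling numbers $s'(n,m)$, and `perms_by_cycles(12, 5, min_len=2)` reproduces the derangement count quoted above.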


The Exponential Formula tells us that $G_S = \exp(G_C)$ when structures in $S$ are built
from labeled disjoint unions of connected structures in $C$. Sometimes we know the generating
function $G_S$ and need to know the generating function $G_C$. Solving for $G_C$, we get $G_C =
\log(G_S)$. The next example uses this formula to obtain information about connected graphs.


11.33. Example: Connected Components of Graphs. Let $C$ be the set of all
connected graphs on one of the vertex sets $\{1,2,\ldots,k\}$, with $k > 0$. Given a $k$-vertex graph
$c \in C$, let $\mathrm{wt}(c) = 0$ and $\mathrm{sz}(c) = k$. Direct computation of the generating function $G_C$ is
difficult. On the other hand, consider the set $S$ of objects we can build from $C$ by the
procedure in the Exponential Formula. Suppose we choose a set partition $P = \{B_1, B_2, \ldots, B_m\}$
of $\{1,2,\ldots,n\}$ for some $n \ge 0$, then choose a connected graph $c(B_i) \in C$ of size $|B_i|$ for
$1 \le i \le m$. We can assemble these choices to get an arbitrary (simple, undirected) graph
with vertex set $\{1,2,\ldots,n\}$ by relabeling the vertices $1, 2, \ldots, |B_i|$ in each graph $c(B_i)$ with
the labels in $B_i$ in increasing order. Thus $S$ consists of all graphs on one of the vertex sets
$\{1,2,\ldots,n\}$, where the size of the graph is the number of vertices and the weight of the
graph is zero. By the Product Rule, there are $2^{\binom{n}{2}}$ graphs with vertex set $\{1,2,\ldots,n\}$, since
we can either include or exclude each of the $\binom{n}{2}$ possible edges. Accordingly,


$$G_S = \sum_{s \in S} t^{\mathrm{wt}(s)} \frac{z^{\mathrm{sz}(s)}}{\mathrm{sz}(s)!} = \sum_{n=0}^{\infty} 2^{\binom{n}{2}} \frac{z^n}{n!}.$$


By the Exponential Formula, $G_C = \log(G_S)$. Extracting the coefficient of $z^n/n!$ on both
sides leads to the exact formula


$$\sum_{m=1}^{n} \frac{(-1)^{m-1}}{m} \sum_{\substack{(k_1,\ldots,k_m) \in \mathbb{Z}_{>0}^m: \\ k_1+\cdots+k_m=n}} \binom{n}{k_1, \ldots, k_m} 2^{\binom{k_1}{2} + \cdots + \binom{k_m}{2}} \tag{11.11}$$


for the number of connected simple graphs on $n$ vertices. When $n = 7$, we find there are
1,866,256 such graphs.
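The relation $G_C = \log(G_S)$ can be evaluated directly on truncated series instead of expanding (11.11) by hand. The sketch below is illustrative only; `mul`, `log_series`, and `connected_graph_counts` are hypothetical names, and exact rational arithmetic is assumed.

```python
from fractions import Fraction
from math import comb, factorial

def mul(A, B, m):
    """Product of the series A and B, keeping only the coefficients of z^0..z^m."""
    C = [Fraction(0)] * (m + 1)
    for i, a in enumerate(A[: m + 1]):
        for j, b in enumerate(B[: m + 1 - i]):
            C[i + j] += a * b
    return C

def log_series(H, m):
    """log(H) truncated after z^m, assuming H has constant term 1."""
    U = [Fraction(0)] + [Fraction(x) for x in H[1 : m + 1]]   # U = H - 1
    total = [Fraction(0)] * (m + 1)
    power = [Fraction(1)] + [Fraction(0)] * m
    for n in range(1, m + 1):
        power = mul(power, U, m)                              # U^n
        sign = 1 if n % 2 == 1 else -1
        total = [t + sign * p / n for t, p in zip(total, power)]
    return total

def connected_graph_counts(n_max):
    """Entry n (for n >= 1) is the number of connected simple graphs on {1..n}."""
    # G_S has coefficient 2^C(n,2) / n! at z^n, since every graph has weight zero.
    G_S = [Fraction(2 ** comb(n, 2), factorial(n)) for n in range(n_max + 1)]
    G_C = log_series(G_S, n_max)
    return [int(G_C[n] * factorial(n)) for n in range(n_max + 1)]
```

For example, `connected_graph_counts(7)` ends with the value for seven vertices quoted in the example.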
11.10 Ordered Trees and Terms


The last main topic in this chapter is compositional inverses of formal power series. Our 
goal is to develop algebraic and combinatorial formulas for the coefficients in these inverses. 
To prepare for this, we must first study combinatorial structures called ordered trees and 
ordered forests. Ordered trees are defined recursively as follows. 


11.34. Definition: Ordered Trees. The symbol 0 is an ordered tree. If $n \in \mathbb{Z}_{>0}$ and
$T_1, \ldots, T_n$ is a sequence of ordered trees, then the $(n+1)$-tuple $(n, T_1, T_2, \ldots, T_n)$ is an
ordered tree. All ordered trees arise by applying these two rules a finite number of times.


We can visualize ordered trees as follows. The ordered tree 0 is depicted as a single node.
The ordered tree $(n, T_1, T_2, \ldots, T_n)$ is drawn by putting a single root node at the top of the
picture with $n$ edges leading down. At the ends of these edges, reading from left to right,
we recursively draw pictures of the trees $T_1, T_2, \ldots, T_n$ in this order. The term ordered tree
emphasizes the fact that the left-to-right order of the children of each node is significant.
Note that an ordered tree is not a tree in the graph-theoretic sense, and ordered trees are
not the same as rooted trees.


11.35. Example. Figure 11.1 illustrates the ordered tree 
T = (4, (2,0, (1, 0)), 0, (3, 0, (3, 0, 0,0), 0), 0). 


FIGURE 11.1 
Diagram of an ordered tree. 


Ordered trees can be used to model algebraic expressions that are built up by applying
functions to lists of inputs. For example, the tree $T$ in the previous example represents the
syntactic structure of the following algebraic expression:


$$f(g(x_1, h(x_2)), x_3, k(x_4, j(x_5, x_6, x_7), x_8), x_9).$$


More specifically, if we replace each function symbol $f, g, h, k, j$ by its arity (number of
inputs) and replace each variable $x_i$ by zero, we obtain


4(2(0, 1(0)), 0, 3(0, 3(0, 0, 0), 0), 0).


This string becomes $T$ if we move each left parenthesis to the left of the positive integer
immediately preceding it and put a comma in its original location.

Surprisingly, the syntactic structure of such an algebraic expression is uniquely
determined even if we erase all the parentheses. To prove this statement, we introduce a
combinatorial object called a term that is like an ordered tree, but contains no parentheses. For
example, the algebraic expression above will be modeled by the term 42010030300000.


11.36. Definition: Words and Terms. A word is a finite sequence of symbols in $\mathbb{Z}_{\ge 0}$.
We define terms recursively as follows. The word 0 is a term. If $n > 0$ and $T_1, T_2, \ldots, T_n$
are terms, then the word $n T_1 T_2 \cdots T_n$ is a term. All terms arise by applying these two rules
a finite number of times.


We see from this definition that every term is a nonempty word.


11.37. Definition: Weight of a Word. Given a word $w = w_1 w_2 \cdots w_s$, the weight of $w$
is $\mathrm{wt}(w) = w_1 + w_2 + \cdots + w_s - s$.


For example, $\mathrm{wt}(42010030300000) = 13 - 14 = -1$. Note that $\mathrm{wt}(vw) = \mathrm{wt}(v) + \mathrm{wt}(w)$
for all words $v, w$. The next result uses weights to characterize terms.


11.38. Theorem: Characterization of Terms. A word $w = w_1 w_2 \cdots w_s$ is a term iff
$\mathrm{wt}(w) = -1$ and $\mathrm{wt}(w_1 w_2 \cdots w_k) \ge 0$ for all $k \in \{1, 2, \ldots, s-1\}$.


Proof. We use strong induction on the length $s$ of the word. First suppose $w = w_1 w_2 \cdots w_s$
is a term of length $s$. If $w = 0$, then the weight condition holds. Otherwise, we must have
$w = n T_1 T_2 \cdots T_n$ where $n > 0$ and $T_1, T_2, \ldots, T_n$ are terms. Since each $T_i$ has length less
than $s$, the induction hypothesis shows that $\mathrm{wt}(T_i) = -1$ and every proper prefix of $T_i$ has
nonnegative weight. So, first of all, $\mathrm{wt}(w) = \mathrm{wt}(n) + \mathrm{wt}(T_1) + \cdots + \mathrm{wt}(T_n) = (n-1) - n = -1$.
On the other hand, consider a proper prefix $w_1 w_2 \cdots w_k$ of $w$. If $k = 1$, the weight of this
prefix is $n - 1$, which is nonnegative since $n > 0$. If $k > 1$, we must have $w_1 w_2 \cdots w_k =
n T_1 \cdots T_i z$ where $0 \le i < n$ and $z$ is a proper prefix of $T_{i+1}$. Using the induction hypothesis,
the weight of $w_1 w_2 \cdots w_k$ is therefore $(n - 1) - i + \mathrm{wt}(z) \ge (n - i) - 1 \ge 0$.

For the converse, we also use strong induction on the length of the word. Let $w =
w_1 w_2 \cdots w_s$ satisfy the weight conditions. The empty word has weight zero, so $s > 0$. If
$s = 1$, then $\mathrm{wt}(w_1) = -1$ forces $w = 0$, so that $w$ is a term in this case. Now suppose
$s > 1$. The first symbol $w_1$ must be an integer $n > 0$, lest the proper prefix $w_1$ of $w$ have
negative weight. Observe that appending one more letter to any word decreases the weight
by at most 1. Since $\mathrm{wt}(w_1) = n - 1$ and $\mathrm{wt}(w_1 w_2 \cdots w_s) = -1$, there exists a least integer
$k_1$ with $\mathrm{wt}(w_1 w_2 \cdots w_{k_1}) = n - 2$. Now if $n \ge 2$, there exists a least integer $k_2 > k_1$ with
$\mathrm{wt}(w_1 w_2 \cdots w_{k_2}) = n - 3$. We continue similarly, obtaining integers $k_1 < k_2 < \cdots < k_n$
such that $k_i$ is the least index following $k_{i-1}$ such that $\mathrm{wt}(w_1 w_2 \cdots w_{k_i}) = n - 1 - i$.
Because $w$ satisfies the weight conditions, we must have $k_n = s$. Now define $n$ subwords
$T_1 = w_2 w_3 \cdots w_{k_1}$, $T_2 = w_{k_1+1} w_{k_1+2} \cdots w_{k_2}$, ..., $T_n = w_{k_{n-1}+1} \cdots w_s$. Evidently $w =
n T_1 T_2 \cdots T_n$. For $1 \le i \le n$, $n T_1 T_2 \cdots T_{i-1}$ has weight $n - i$, $n T_1 T_2 \cdots T_i$ has weight $n - i - 1$,
and (by minimality of $k_i$) no proper prefix of $n T_1 T_2 \cdots T_i$ of length at least $k_{i-1}$ has weight
less than $n - i$. It follows that $T_i$ has weight $-1$ but every proper prefix of $T_i$ has nonnegative
weight. Thus each $T_i$ satisfies the weight conditions and is shorter than $w$. By induction,
every $T_i$ is a term. Then $w = n T_1 T_2 \cdots T_n$ is also a term, completing the induction. □
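The weight test in Theorem 11.38 translates directly into code. The sketch below is illustrative only: words are represented as lists of nonnegative integers, and the names `weight`, `is_term`, and `parse_term` are hypothetical. The parser follows the recursive definition of terms and rebuilds the corresponding ordered tree as nested tuples; it assumes its input is a valid term.

```python
def weight(word):
    """wt(w) = w_1 + ... + w_s - s."""
    return sum(word) - len(word)

def is_term(word):
    """Theorem 11.38: weight -1, and every proper nonempty prefix has weight >= 0."""
    if not word or weight(word) != -1:
        return False
    running = 0
    for symbol in word[:-1]:          # weights of the prefixes of length 1..s-1
        running += symbol - 1
        if running < 0:
            return False
    return True

def parse_term(word, start=0):
    """Read one term beginning at index start; return (ordered tree, next index)."""
    n = word[start]
    pos = start + 1
    if n == 0:
        return 0, pos
    children = []
    for _ in range(n):
        child, pos = parse_term(word, pos)
        children.append(child)
    return (n, *children), pos
```

Applied to the word 42010030300000 (as a list of digits), `is_term` returns `True` and `parse_term` recovers the ordered tree $T$ of Example 11.35.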


11.39. Corollary. No proper prefix of a term is a term. 


11.40. Theorem: Unique Readability of Terms. For every term $w$, there exists a
unique integer $n \ge 0$ and unique terms $T_1, \ldots, T_n$ such that $w = n T_1 \cdots T_n$.


Proof. Existence follows from the recursive definition of terms. We prove uniqueness by
induction on the length of $w$. Suppose $w = n T_1 \cdots T_n = m T'_1 \cdots T'_m$ where $n, m \ge 0$ and
every $T_i$ and $T'_j$ is a term. We must prove $n = m$ and $T_i = T'_i$ for $1 \le i \le n$. First,
$n = w_1 = m$. If $T_1 \ne T'_1$, then one of $T_1$ and $T'_1$ must be a proper prefix of the other, in
violation of the preceding corollary. So $T_1 = T'_1$. Then if $T_2 \ne T'_2$, one of $T_2$ and $T'_2$ must be
a proper prefix of the other, in violation of the corollary. Continuing similarly, we see that
$T_i = T'_i$ for $i = 1, 2, \ldots, n$. □

Using the previous theorem and induction, it can be checked that erasing all parentheses
defines a bijection from ordered trees to terms. Therefore, to count various collections of
ordered trees, it suffices to count the corresponding collections of terms. We give examples
of this technique in the next section.




11.11 Ordered Forests and Lists of Terms 


We continue our study of ordered trees and terms by introducing two more general concepts: 
ordered forests and lists of terms. 


11.41. Definition: Ordered Forests. For $n \in \mathbb{Z}_{\ge 0}$, an ordered forest of $n$ trees is a list
$(T_1, T_2, \ldots, T_n)$, where each $T_i$ is an ordered tree.


11.42. Definition: Lists of Terms. For $n \in \mathbb{Z}_{\ge 0}$, a list of $n$ terms is a word $w$ of the
form $w = T_1T_2\cdots T_n$, where each $T_i$ is a term.


11.43. Theorem: Weight Characterization of Lists of Terms. A word $w =
w_1w_2\cdots w_s$ is a list of $n$ terms iff $\mathrm{wt}(w) = -n$ and $\mathrm{wt}(w_1w_2\cdots w_k) > -n$ for all
$k \in \{1,2,\ldots,s-1\}$.


Proof. First suppose $w$ is a list of $n$ terms, say $w = T_1T_2\cdots T_n$. Then $nw = nT_1T_2\cdots T_n$ is a
single term. This term has weight $-1$, by Theorem 11.38, so $w$ has weight $-1-\mathrm{wt}(n) = -n$.
If $\mathrm{wt}(w_1\cdots w_k) \le -n$ for some $k < s$, then the proper prefix $nw_1\cdots w_k$ of the term $nw$
would have negative weight, contradicting Theorem 11.38.

Conversely, suppose $w$ satisfies the weight conditions in Theorem 11.43. Then the word
$nw$ satisfies the weight conditions in Theorem 11.38, as one may verify. So $nw$ is a term,
which must have the form $nT_1T_2\cdots T_n$ for certain terms $T_1,\ldots,T_n$. Then $w = T_1T_2\cdots T_n$
is a list of $n$ terms. □


11.44. Theorem: Unique Readability of Lists of Terms. If $w = T_1T_2\cdots T_n$ is a list
of $n$ terms, then $n$ and the terms $T_i$ are uniquely determined by $w$.


Proof. First, $n = -\mathrm{wt}(w)$ is uniquely determined by $w$. To see that the $T_i$ are unique, add
an $n$ to the beginning of $w$ and then appeal to Theorem 11.40. □
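
Theorems 11.43 and 11.44 suggest a one-pass way to cut a list of terms into its terms: a term ends exactly when the running weight reaches a new minimum $-1, -2, \ldots, -n$. The sketch below is an illustration under that reading; the function name `split_terms` is hypothetical, and words are again lists of nonnegative integers with $\mathrm{wt}(i) = i-1$.

```python
def split_terms(word):
    """Cut a list of terms into its terms using running weights.

    A new term ends each time the running weight first reaches
    -1, -2, ..., -n.  Returns None if `word` is not a list of terms.
    """
    pieces, current, total = [], [], 0
    for x in word:
        current.append(x)
        total += x - 1
        if total == -(len(pieces) + 1):  # new minimum: a term just ended
            pieces.append(current)
            current = []
    # Leftover symbols mean the word does not end at a new minimum.
    return pieces if not current else None
```

For example, the word 0 2 0 0 1 0 splits as 0 | 2 0 0 | 1 0, and the number of pieces equals $-\mathrm{wt}(w) = 3$.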


We deduce that erasing parentheses gives a bijection between ordered forests of n trees 
and lists of n terms. 
The next lemma reveals a key property that will allow us to count lists of terms. 


11.45. The Cycle Lemma for Lists of Terms. Suppose $w = w_1w_2\cdots w_s$ is a word of
weight $-n < 0$. There exist exactly $n$ indices $i \in \{1,2,\ldots,s\}$ such that the cyclic rotation
\[ R_i(w) = w_iw_{i+1}\cdots w_sw_1w_2\cdots w_{i-1} \]
is a list of $n$ terms.


Proof. Step 1. We prove the result when $w$ itself is a list of $n$ terms. Say $w = T_1T_2\cdots T_n$
where $T_i$ is a term of length $k_i$. Then $R_i(w)$ is a list of $n$ terms for the $n$ indices $i \in
\{1, k_1+1, k_1+k_2+1, \ldots, k_1+k_2+\cdots+k_{n-1}+1\}$. Suppose $i$ is another index (different
from those just listed) such that $R_i(w)$ is a list of $n$ terms. For some $j$ in the range $1 \le j \le n$,
we must have
\[ R_i(w) = yT_{j+1}\cdots T_nT_1\cdots T_{j-1}z, \]
where $T_j = zy$ and $z, y$ are nonempty words. Since $\mathrm{wt}(z) \ge 0$ but $\mathrm{wt}(T_j) = -1$, we must
have $\mathrm{wt}(y) < 0$. So
\[ \mathrm{wt}(yT_{j+1}\cdots T_nT_1\cdots T_{j-1}) = \mathrm{wt}(y) + \mathrm{wt}(T_{j+1}) + \cdots + \mathrm{wt}(T_{j-1}) < -(n-1). \]
Then $yT_{j+1}\cdots T_{j-1}$ is a proper prefix of $R_i(w)$ with weight $\le -n$, in violation of Theorem 11.43.

Step 2. We prove the result for a general word $w$. It suffices to show that there exists at
least one $i$ such that $R_i(w)$ is a list of $n$ terms. For if this holds, since we obtain the same
collection of words by cyclically shifting $w$ and $R_i(w)$, the result follows from Step 1.

First note that all cyclic rotations of $w$ have weight $\sum_{j=1}^{s} \mathrm{wt}(w_j) = \mathrm{wt}(w) = -n$.
Let $m$ be the minimum weight of any prefix $w_1w_2\cdots w_k$ of $w$, where $1 \le k \le s$. Choose $k$
minimal such that $\mathrm{wt}(w_1w_2\cdots w_k) = m$. If $k = s$, then $m = -n$, and by minimality of $k$ and
Theorem 11.43, $w$ itself is already a list of $n$ terms. Otherwise, let $i = k+1$. We claim $R_i(w)$
is a list of $n$ terms. It suffices to check that each proper prefix of $R_i(w)$ has weight $> -n$. On
one hand, for all $j$ with $i \le j \le s$, the prefix $w_i\cdots w_j$ of $R_i(w)$ cannot have negative weight;
otherwise, $\mathrm{wt}(w_1\cdots w_kw_i\cdots w_j) < m$ violates the minimality of $m$. So $\mathrm{wt}(w_i\cdots w_j) \ge 0 >
-n$. Note that when $j = s$, we have $\mathrm{wt}(w_i\cdots w_s) = \mathrm{wt}(w) - \mathrm{wt}(w_1\cdots w_k) = -n-m$. Now
consider $j$ in the range $1 \le j < k$. If $\mathrm{wt}(w_i\cdots w_sw_1\cdots w_j) \le -n$, then
\[ \mathrm{wt}(w_1\cdots w_j) = \mathrm{wt}(w_i\cdots w_sw_1\cdots w_j) - \mathrm{wt}(w_i\cdots w_s) \le -n - (-n-m) = m. \]
But this violates the choice of $k$ as the least index such that the prefix ending at $k$ has
minimum weight. So $\mathrm{wt}(w_i\cdots w_sw_1\cdots w_j) > -n$. It now follows from Theorem 11.43 that
$R_i(w)$ is indeed a list of $n$ terms. □
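
The Cycle Lemma and the construction in Step 2 are easy to check experimentally. The following Python sketch is illustrative only; the helper names are hypothetical, indices are 1-based as in the text, and words are lists of nonnegative integers with $\mathrm{wt}(i) = i-1$.

```python
def rotation(word, i):
    """R_i(w) for a 1-based index i: w_i ... w_s w_1 ... w_{i-1}."""
    return word[i - 1:] + word[:i - 1]


def is_list_of_terms(word, n):
    """Weight test of Theorem 11.43: weight -n, proper prefixes have weight > -n."""
    total = 0
    for pos, x in enumerate(word, start=1):
        total += x - 1
        if pos < len(word) and total <= -n:
            return False
    return total == -n


def good_rotations(word):
    """All 1-based indices i such that R_i(word) is a list of n terms."""
    n = -sum(x - 1 for x in word)
    return [i for i in range(1, len(word) + 1)
            if is_list_of_terms(rotation(word, i), n)]


def first_good_rotation(word):
    """Index chosen in Step 2: one past the first prefix of minimum weight."""
    total, best, k = 0, None, 0
    for pos, x in enumerate(word, start=1):
        total += x - 1
        if best is None or total < best:  # strict: keeps the least such k
            best, k = total, pos
    return 1 if k == len(word) else k + 1
```

The word 1 0 2 0 0 1 has weight $-2$, and exactly two of its six rotations (those starting at positions 3 and 6) are lists of two terms. Step 2 picks position 6, since the minimum prefix weight $-2$ is first reached at $k = 5$.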


Suppose $w$ is a list of $n$ terms containing exactly $k_i$ occurrences of $i$ for each $i \ge 0$. We
have
\[ -n = \mathrm{wt}(w) = \sum_{i \ge 0} k_i\,\mathrm{wt}(i) = \sum_{i \ge 0} (i-1)k_i = -k_0 + \sum_{i \ge 1} (i-1)k_i. \]
It follows that $k_0 = n + \sum_{i \ge 1}(i-1)k_i$ in this situation. Conversely, if $k_0$ satisfies this
relation, then $\mathrm{wt}(w) = -n$ for all $w \in \mathcal{R}(0^{k_0}1^{k_1}2^{k_2}\cdots)$. We now have all the ingredients
needed for our main counting result.


11.46. Theorem: Counting Lists of Terms. Let $n > 0$ and $k_0, k_1, \ldots, k_t \ge 0$ be given
integers such that $k_0 = n + \sum_{i=1}^{t}(i-1)k_i$. The number of words $w$ such that $w$ is a list of
$n$ terms containing $k_i$ copies of $i$ for $0 \le i \le t$ is
\[ \frac{n}{s}\binom{s}{k_0, k_1, \ldots, k_t} = \frac{n(s-1)!}{k_0!\,k_1!\cdots k_t!}, \]
where $s = \sum_{i=0}^{t} k_i = n + \sum_{i=1}^{t} ik_i$ is the common length of all such words.


Proof. Let $A$ be the set of all pairs $(w,j)$, where $w \in \mathcal{R}(0^{k_0}1^{k_1}\cdots t^{k_t})$ is a word and $j \in
\{1,2,\ldots,s\}$ is an index such that the cyclic rotation $R_j(w)$ is a list of $n$ terms. Combining
Lemma 11.45 and the Anagram Rule, we see that $|A| = n\binom{s}{k_0,k_1,\ldots,k_t}$.

Let $B$ be the set of all words $w \in \mathcal{R}(0^{k_0}1^{k_1}\cdots t^{k_t})$ such that $w$ is a list of $n$ terms
(necessarily of length $s$). To complete the proof, we show $|A| = s|B|$ by exhibiting mutually
inverse bijections $f : A \to B \times \{1,2,\ldots,s\}$ and $g : B \times \{1,2,\ldots,s\} \to A$. We define $f(w,j) =
(R_j(w), j)$ for all $(w,j) \in A$, and $g(w,i) = (R_i^{-1}(w), i)$ for all $(w,i) \in B \times \{1,2,\ldots,s\}$. □
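
As a sanity check on Theorem 11.46, the closed formula can be compared with a brute-force count over an anagram class. The sketch below is an illustration, not part of the proof; the function names are hypothetical, the input `ks` is the list $(k_0, k_1, \ldots, k_t)$, and the brute-force routine is only practical for short words.

```python
from itertools import permutations
from math import factorial


def count_by_formula(ks):
    """n (s-1)! / (k_0! k_1! ... k_t!), where n = k_0 - sum_{i>=1} (i-1) k_i."""
    s = sum(ks)
    n = ks[0] - sum((i - 1) * k for i, k in enumerate(ks) if i >= 1)
    denom = 1
    for k in ks:
        denom *= factorial(k)
    return n * factorial(s - 1) // denom


def count_by_brute_force(ks):
    """Count distinct rearrangements that pass the weight test of Theorem 11.43."""
    letters = [i for i, k in enumerate(ks) for _ in range(k)]
    n = -sum(x - 1 for x in letters)

    def ok(word):
        total = 0
        for pos, x in enumerate(word, start=1):
            total += x - 1
            if pos < len(word) and total <= -n:
                return False
        return total == -n

    return sum(1 for w in set(permutations(letters)) if ok(w))
```

With $(k_0,k_1,k_2) = (3,1,1)$ we get $n = 2$ and $s = 5$, and both counts equal $2 \cdot 4!/3! = 8$: the eight lists pair the term 0 with one of 1 2 0 0, 2 1 0 0, 2 0 1 0 (in either order), or pair 1 0 with 2 0 0 (in either order).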


11.12 Compositional Inversion


Earlier in the chapter, we studied multiplicative inverses of formal power series $F \in K[[z]]$
such that $F(0) \ne 0$. In this section, we study compositional inverses of formal power series
$G \in K[[z]]$ such that $G(0) = 0$. We find algebraic and combinatorial formulas for the
coefficients in these inverses. We begin by showing that a certain subset of $K[[z]]$ is a group
under the operation of formal composition.


11.47. Theorem: Group Axioms for Formal Composition. The set
\[ \mathcal{G} = \{F \in K[[z]] : F|_{z^0} = 0 \text{ and } F|_{z^1} \ne 0\} \]
is a group under the operation $\circ$ (composition of formal power series).


Proof. We check the group axioms in Definition 7.1. For closure, fix $F, G \in \mathcal{G}$. Write $F =
\sum_{n \ge 0} a_nz^n$ and $G = \sum_{n \ge 0} b_nz^n$, where $a_0 = 0 = b_0$ and $a_1 \ne 0 \ne b_1$. By Definition 11.22,
$F \circ G = \sum_{n \ge 0} a_nG^n$. Since $G(0) = 0$ and $a_0 = 0$, each summand $a_nG^n$ has constant term zero, so
$(F \circ G)(0) = 0$. On the other hand, the only summand $a_nG^n$ that has a nonzero coefficient
of $z^1$ is $a_1G^1$. We see that $(F \circ G)|_{z^1} = (a_1G)|_{z^1} = a_1b_1$, which is nonzero since $a_1$ and $b_1$
are nonzero. Thus $F \circ G$ is in $\mathcal{G}$, so the closure axiom holds. We have seen in Example 11.23
that $z \in \mathcal{G}$ is a two-sided identity element relative to composition: $z \circ F = F = F \circ z$ for
all $F \in \mathcal{G}$. Theorem 11.24(c) shows that $\circ$ is associative: $F \circ (G \circ H) = (F \circ G) \circ H$ for all
$F, G, H \in \mathcal{G}$.

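We must also verify the inverse axiom: for all $F \in \mathcal{G}$, there exists $G \in \mathcal{G}$ with $G \circ F = z =
F \circ G$. Fix $F = \sum_{n \ge 1} a_nz^n \in \mathcal{G}$; we first prove there exists a unique $G = \sum_{m \ge 1} b_mz^m \in \mathcal{G}$
solving the equation $G \circ F = z$. For each $n \in \mathbb{Z}_{\ge 0}$, the coefficient of $z^n$ in $G \circ F$ is
\[ (G \circ F)|_{z^n} = \left.\left(\sum_{m=1}^{n} b_mF^m\right)\right|_{z^n} \]
(the summands with $m > n$ do not contribute, since $F^m$ has order $m$).
We show there is a unique choice of the sequence of scalars $(b_m : m \ge 1)$ that makes this
coefficient equal 1 for $n = 1$ and 0 otherwise. When $n = 1$, we need $1 = b_1F|_{z^1} = b_1a_1$. We
know $a_1 \ne 0$ in $K$, so this equation is satisfied iff $b_1 = 1/a_1$, which is a nonzero element of
the field $K$.

Now fix an integer $n > 1$, and assume we have already found unique $b_1,\ldots,b_{n-1} \in K$
making the coefficients of $z^k$ in $G \circ F$ and $z$ agree for all $k < n$. Since $(b_nF^n)|_{z^n} = b_na_1^n$,
we need to choose $b_n$ so that
\[ 0 = \left.\left(\sum_{m=1}^{n} b_mF^m\right)\right|_{z^n} = b_na_1^n + \left.\left(\sum_{m=1}^{n-1} b_mF^m\right)\right|_{z^n}. \]
There is a unique $b_n \in K$ that works, namely
\[ b_n = -\frac{1}{a_1^n}\left.\left(\sum_{m=1}^{n-1} b_mF^m\right)\right|_{z^n}. \tag{11.12} \]
We have now found a unique $G = \sum_{m \ge 1} b_mz^m$ in $\mathcal{G}$ such that $G \circ F = z$. We call $G$ the left
inverse of $F$ in $\mathcal{G}$.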


To finish, we show that $G \circ F = z$ automatically implies $F \circ G = z$. We have shown that
every element of $\mathcal{G}$ has a left inverse. So let $H \in \mathcal{G}$ be the left inverse of $G$, which satisfies
$H \circ G = z$. Using the associativity and identity axioms, we compute
\[ H = H \circ z = H \circ (G \circ F) = (H \circ G) \circ F = z \circ F = F. \]
Since $H = F$ and $H \circ G = z$, we conclude $F \circ G = z$, as needed. □
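
The recursion for the coefficients $b_n$ found in this proof can be run directly on truncated series. The Python sketch below is an illustration under stated assumptions: the names `mul` and `left_inverse` are hypothetical, a series is a list of coefficients indexed by the power of $z$, exact rational arithmetic stands in for the field $K$, and $N \ge 1$.

```python
from fractions import Fraction


def mul(f, g, N):
    """Product of two truncated series (coefficient lists), kept through z^N."""
    h = [Fraction(0)] * (N + 1)
    for i, a in enumerate(f[:N + 1]):
        if a:
            for j, b in enumerate(g[:N + 1 - i]):
                h[i + j] += a * b
    return h


def left_inverse(a, N):
    """Coefficients b_0..b_N of G with G o F = z, where F = sum a[n] z^n.

    Assumes a[0] == 0 and a[1] != 0; coefficients of F that are not
    supplied are treated as zero.
    """
    a = ([Fraction(x) for x in a] + [Fraction(0)] * (N + 1))[:N + 1]
    b = [Fraction(0)] * (N + 1)
    b[1] = 1 / a[1]
    for n in range(2, N + 1):
        acc = Fraction(0)
        power = [Fraction(1)] + [Fraction(0)] * N  # F^0
        for m in range(1, n):
            power = mul(power, a, N)               # now F^m
            acc += b[m] * power[n]                 # b_m * (F^m)|_{z^n}
        b[n] = -acc / a[1] ** n
    return b
```

For $F = z - z^2$, the computed coefficients 0, 1, 1, 2, 5, 14 are the Catalan numbers, as expected from solving $G - G^2 = z$. The recursion recomputes powers of $F$ for every $n$, so it is meant for checking small cases rather than for efficiency.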


Our proof of the inverse axiom contains a recursive formula for finding the coefficients 
in the compositional inverse of F € G. Our next goal is to find other combinatorial and 
algebraic formulas for these coefficients that can sometimes be more convenient to use. Given 
$F \in \mathcal{G}$, write $F = zF^*$, where $F^* = f_1 + f_2z + f_3z^2 + \cdots$ is a formal series with nonzero
constant term. By Theorem 11.11, the series $F^*$ has a multiplicative inverse $R = \sum_{n \ge 0} r_nz^n$
with $r_0 \ne 0$. Writing $F(z) = z/R(z)$, we have $F \circ G = z$ iff $F(G(z)) = z$ iff $G(z)/R(G(z)) = z$
iff $G(z) = zR(G(z))$ iff $G = z \cdot (R \circ G)$.

It turns out that we can solve the equation $G = z(R \circ G)$ by taking $G$ to be the
generating function for the set of ordered trees (or equivalently, terms) relative to a certain 
weight function. This idea is the essence of the following combinatorial formula for G. 


11.48. Theorem: Combinatorial Compositional Inversion Formula. Let $F(z) =
z/R(z)$ where $R(z) = \sum_{n \ge 0} r_nz^n$ is a given series in $K[[z]]$ with $r_0 \ne 0$. Let $T$ be the set of
all terms, and let the weight of a term $w = w_1w_2\cdots w_s \in T$ be $\mathrm{wt}(w) = r_{w_1}r_{w_2}\cdots r_{w_s}z^s$.
Then $G(z) = \mathrm{GF}(T) = \sum_{w \in T} \mathrm{wt}(w)$ is the compositional inverse of $F(z)$.


Proof. For any two words $v$ and $w$, we have $\mathrm{wt}(vw) = \mathrm{wt}(v)\,\mathrm{wt}(w)$. Also $G(0) = 0$, since
every term has positive length. By Theorem 11.40, we know that for every term $w \in T$, there
exist a unique integer $n \ge 0$ and unique terms $t_1,\ldots,t_n \in T$ such that $w = nt_1t_2\cdots t_n$.
For fixed $n$, we build such a term by choosing the symbol $n$ (which has weight $zr_n$), then
choosing terms $t_1 \in T$, $t_2 \in T$, ..., $t_n \in T$. By the Product Rule for Weighted Sets (adapted
to weights satisfying the multiplicative condition $\mathrm{wt}(vw) = \mathrm{wt}(v)\,\mathrm{wt}(w)$), the generating
function for terms starting with $n$ is therefore $zr_nG(z)^n$. By the Infinite Sum Rule for
Weighted Sets (§11.2), we conclude that
\[ G(z) = \sum_{n=0}^{\infty} zr_nG(z)^n = zR(G(z)), \]
so $G = z(R \circ G)$. By the remarks preceding the theorem, this shows that $F \circ G = z$, as
needed. Recall that $G \circ F = z$ automatically follows, as in the proof of Theorem 11.47. □
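
The equation $G = z(R \circ G)$ also gives a simple way to compute $G$ numerically: starting from $G = 0$, each substitution of the current approximation into $zR(G)$ fixes at least one more coefficient. The sketch below illustrates this under assumptions of its own: the name `solve_G` is hypothetical, $R$ is given as a finite coefficient list (so it is a polynomial), and rational arithmetic stands in for $K$.

```python
from fractions import Fraction


def solve_G(r, N):
    """Coefficients of G through z^N, where G = z * R(G) and R = sum r[n] z^n.

    Starting from G = 0, each pass of the substitution G <- z R(G)
    makes at least one more coefficient correct, so N passes suffice.
    """
    def mul(f, g):
        h = [Fraction(0)] * (N + 1)
        for i, a in enumerate(f):
            if a:
                for j in range(N + 1 - i):
                    h[i + j] += a * g[j]
        return h

    G = [Fraction(0)] * (N + 1)
    for _ in range(N):
        RG = [Fraction(0)] * (N + 1)               # accumulates R(G)
        power = [Fraction(1)] + [Fraction(0)] * N  # G^0
        for c in r:
            RG = [x + Fraction(c) * y for x, y in zip(RG, power)]
            power = mul(power, G)
        G = [Fraction(0)] + RG[:N]                 # multiply by z
    return G
```

With $R = 1 + z^2$, only the symbols 0 and 2 carry nonzero weight, so $G$ counts terms built from 0 and 2 by length: the coefficients of $z, z^3, z^5, z^7$ are 1, 1, 2, 5.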


Theorem 11.46 provides a formula counting all terms in a given anagram class
$\mathcal{R}(0^{k_0}1^{k_1}2^{k_2}\cdots)$. Combining this formula with the previous result, we deduce the following
algebraic recipe for the coefficients of $G$.


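11.49. The Lagrange Inversion Formula. Let $F(z) = z/R(z)$ where $R(z) = \sum_{n \ge 0} r_nz^n$
is a given series in $K[[z]]$ with $r_0 \ne 0$. Let $G$ be the compositional inverse of $F$. For all
$n \in \mathbb{Z}_{\ge 1}$,
\[ G(z)|_{z^n} = \frac{1}{n}R(z)^n\big|_{z^{n-1}} = \frac{1}{n!}\left.\left[\left(\frac{d}{dz}\right)^{n-1}R(z)^n\right]\right|_{z^0}. \]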


Proof. The second equality follows routinely from the definition of formal differentiation.
To prove the first equality, let $T_n$ be the set of terms of length $n$. By Theorem 11.48, we
know that
\[ G(z)|_{z^n} = \sum_{w \in T_n} r_{w_1}r_{w_2}\cdots r_{w_n}. \]
Let us group together summands on the right side corresponding to terms of length $n$ that
contain $k_0$ zeroes, $k_1$ ones, etc., where $\sum_{i \ge 0} k_i = n$. Each such term has weight $z^nr_0^{k_0}r_1^{k_1}r_2^{k_2}\cdots$,
and the number of such terms is $\frac{1}{n}\binom{n}{k_0,k_1,\ldots}$, provided that $k_0 = 1 + \sum_{i \ge 1}(i-1)k_i$ (see
Theorem 11.46). Summing over all possible choices of the $k_i$, we get
\[ G(z)|_{z^n} = \sum_{\substack{k_0+k_1+k_2+\cdots = n, \\ k_0 = 1+0k_1+1k_2+2k_3+\cdots}} \frac{1}{n}\binom{n}{k_0,k_1,\ldots}\prod_{i \ge 0} r_i^{k_i}. \]
In the presence of the condition $\sum_{i \ge 0} k_i = n$, the equation $k_0 = 1 + \sum_{i \ge 1}(i-1)k_i$ holds iff
$\sum_{i \ge 0}(i-1)k_i = -1$ iff $\sum_{i \ge 0} ik_i = n-1$. So
\[ G(z)|_{z^n} = \sum_{\substack{k_0+k_1+k_2+\cdots = n, \\ 0k_0+1k_1+2k_2+\cdots = n-1}} \frac{1}{n}\binom{n}{k_0,k_1,\ldots}\prod_{i \ge 0} r_i^{k_i}. \]
On the other hand, Theorem 5.35 shows that
\[ \frac{1}{n}R(z)^n\big|_{z^{n-1}} = \frac{1}{n}\sum_{i_1+i_2+\cdots+i_n = n-1} r_{i_1}r_{i_2}\cdots r_{i_n}. \]
Each summand containing $k_0$ copies of $r_0$, $k_1$ copies of $r_1$, etc., can be rearranged to
$\prod_{i \ge 0} r_i^{k_i}$, and there are $\binom{n}{k_0,k_1,\ldots}$ such summands by the Anagram Rule. So this formula for
$\frac{1}{n}R(z)^n|_{z^{n-1}}$ reduces to the previous formula for $G(z)|_{z^n}$. □


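11.50. Example. Given $F(z) = z/e^z$, let us use Theorem 11.49 to find the compositional
inverse $G$ of $F$. Here $R(z) = e^z = \sum_{k \ge 0} z^k/k!$, and $R(z)^n = e^{nz} = \sum_{k \ge 0} (n^k/k!)z^k$. So
\[ G(z)|_{z^n} = \frac{1}{n}R(z)^n\big|_{z^{n-1}} = \frac{1}{n}\cdot\frac{n^{n-1}}{(n-1)!} = \frac{n^{n-1}}{n!}, \]
and $G(z) = \sum_{n \ge 1} \frac{n^{n-1}}{n!}z^n$.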
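
The computation in this example can be checked mechanically by expanding $R(z)^n$ as a truncated series and reading off the coefficient of $z^{n-1}$. The following Python sketch is illustrative only; the name `lagrange_coeff` is hypothetical, and $R$ is supplied as a list of its first few coefficients.

```python
from fractions import Fraction
from math import factorial


def lagrange_coeff(r, n):
    """(1/n) * coefficient of z^(n-1) in R(z)^n, where R = sum r[k] z^k.

    Only r[0], ..., r[n-1] can influence the result, so every series
    is truncated after z^(n-1).
    """
    size = n
    base = [Fraction(x) for x in r[:size]] + [Fraction(0)] * max(0, size - len(r))
    power = [Fraction(1)] + [Fraction(0)] * (size - 1)  # R^0
    for _ in range(n):
        nxt = [Fraction(0)] * size
        for i, a in enumerate(power):
            if a:
                for j in range(size - i):
                    nxt[i + j] += a * base[j]
        power = nxt
    return power[n - 1] / n


N = 6
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]  # e^z through z^5
values = [lagrange_coeff(exp_coeffs, n) for n in range(1, N + 1)]
expected = [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]
```

Here `values` and `expected` agree for $1 \le n \le 6$, matching the closed form $n^{n-1}/n!$ obtained above.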


Summary 


• Algebraic Operations on Formal Power Series. $K[[z]]$ is the set of sequences
$(a_n : n \in \mathbb{Z}_{\ge 0})$ with all $a_n \in K$. Given $F = \sum_{n \ge 0} a_nz^n$ and $G = \sum_{n \ge 0} b_nz^n$ in $K[[z]]$, we
define:
(a) Equality. $F = G$ iff $a_n = b_n$ for all $n \in \mathbb{Z}_{\ge 0}$.
(b) Addition. $F + G = \sum_{n \ge 0} (a_n + b_n)z^n$.
(c) Multiplication. $F \cdot G = \sum_{n \ge 0} \left(\sum_{k=0}^{n} a_kb_{n-k}\right)z^n$.
(d) Coefficient Extraction. For $n \ge 0$, $F|_{z^n} = a_n$ and $F(0) = a_0$.
(e) Order. If $F \ne 0$, $\mathrm{ord}(F)$ is the least $n$ with $a_n \ne 0$.
(f) Formal Differentiation. $F' = \sum_{n \ge 1} na_nz^{n-1} = \sum_{m \ge 0} (m+1)a_{m+1}z^m$.
(g) Formal Composition. If $G(0) = 0$, $F \circ G = \sum_{n \ge 0} a_nG^n$.
(h) Formal Exponentiation. If $G(0) = 0$, $\exp(G) = e^G = \sum_{n \ge 0} G^n/n!$.
(i) Formal Logarithm. If $F(0) = 1$, $\log(F) = \sum_{n \ge 1} (-1)^{n-1}(F-1)^n/n$.


• Limit Operations on Formal Power Series. Let $F$ and $F_m$ (for $m \in \mathbb{Z}_{\ge 0}$) be formal
power series in $K[[z]]$.
(a) Limits. $(F_m) \to F$ iff $\lim_{m \to \infty} F_m = F$ iff
\[ \forall k \in \mathbb{Z}_{\ge 0},\ \exists M \in \mathbb{Z}_{\ge 0},\ \forall m \ge M,\ F_m|_{z^k} = F|_{z^k}. \]
(b) Infinite Sums. $\sum_{m=0}^{\infty} F_m = F$ iff $\lim_{N \to \infty} \sum_{m=0}^{N} F_m = F$.
If all $F_m \ne 0$, the infinite sum converges iff $\lim_{m \to \infty} \mathrm{ord}(F_m) = \infty$.
(c) Infinite Products. $\prod_{m=0}^{\infty} F_m = F$ iff $\lim_{N \to \infty} \prod_{m=0}^{N} F_m = F$.
If all $F_m \ne 0$, $\prod_{m=0}^{\infty} (1 + F_m)$ converges iff some $F_m = -1$ or $\lim_{m \to \infty} \mathrm{ord}(F_m) = \infty$.


• Formal Continuity. A function $p$ mapping a subset of $K[[z]]$ into $K[[z]]$ is continuous iff
for all $F_m, F$ in the domain of $p$, $(F_m) \to F$ implies $(p(F_m)) \to p(F)$. If the domain of $p$
is a subset of $K[[z]] \times K[[z]]$, continuity means that whenever $(F_m) \to F$ and $(G_m) \to G$,
$(p(F_m, G_m)) \to p(F, G)$. Formal addition, multiplication, coefficient extraction, differentiation,
composition, exponentiation, and logarithm are all continuous. The composition
of continuous functions is continuous.


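• The Infinite Sum Rule for Weighted Sets. Suppose $S_k$ is a nonempty weighted set
for each $k \in \mathbb{Z}_{\ge 1}$, $S_j \cap S_k = \emptyset$ for all $j \ne k$, and $S = \bigcup_{k \ge 1} S_k$. Assume that for all $k$ and
all $u \in S_k$, $\mathrm{wt}_S(u) = \mathrm{wt}_{S_k}(u)$. For each $k$, let $\mathrm{minwt}(S_k) = \min\{\mathrm{wt}_{S_k}(u) : u \in S_k\}$. If
$\lim_{k \to \infty} \mathrm{minwt}(S_k) = \infty$, then $\mathrm{GF}(S; z) = \sum_{k=1}^{\infty} \mathrm{GF}(S_k; z)$.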


• The Infinite Product Rule for Weighted Sets. For each $k \in \mathbb{Z}_{\ge 1}$, let $S_k$ be a weighted
set that contains a unique object $o_k$ of weight zero. Assume $\lim_{k \to \infty} \mathrm{minwt}(S_k - \{o_k\}) = \infty$.
Suppose $S$ is a weighted set such that every $u \in S$ can be constructed in exactly one way
as follows. For each $k \ge 1$, choose $u_k \in S_k$, subject to the restriction that we must choose
$o_k$ for all but finitely many $k$'s. Then assemble the chosen objects in a prescribed manner.
Assume that whenever $u$ is constructed from $(u_k : k \ge 1)$, $\mathrm{wt}_S(u) = \sum_{k=1}^{\infty} \mathrm{wt}_{S_k}(u_k)$.
Then $\mathrm{GF}(S; z) = \prod_{k=1}^{\infty} \mathrm{GF}(S_k; z)$.


• Multiplicative Inverses of Formal Series. $F = \sum_{n \ge 0} a_nz^n \in K[[z]]$ is invertible iff
there exists $G = \sum_{n \ge 0} u_nz^n \in K[[z]]$ with $FG = 1$ iff $a_0 \ne 0$.
(a) Recursion for Coefficients of $1/F$. When $a_0 \ne 0$, the coefficients of $G$ are determined
recursively by $u_0 = a_0^{-1}$ and $u_n = -a_0^{-1}\sum_{k=1}^{n} a_ku_{n-k}$ for all $n > 0$.
(b) Closed Formula for Coefficients of $1/F$. When $a_0 \ne 0$,
\[ u_n = \sum_{m=0}^{n} \frac{(-1)^m}{a_0^{m+1}} \sum_{\substack{(i_1,i_2,\ldots,i_m) \in \mathbb{Z}_{>0}^m: \\ i_1+i_2+\cdots+i_m = n}} a_{i_1}a_{i_2}\cdots a_{i_m}. \]
(c) Formal Geometric Series. When $a_0 = 0$, $(1 - F)^{-1} = \sum_{m \ge 0} F^m$.
(d) Symmetric Function Formula for $1/F$. $H(z) = \sum_{n \ge 0} (-1)^nh_nz^n$ is the multiplicative
inverse of $E(z) = \sum_{n \ge 0} e_nz^n$, where $e_n$ and $h_n$ denote elementary and complete symmetric
functions. If $\phi : \Lambda \to K$ is the homomorphism sending $e_n$ to $a_n$, then $u_n = \phi((-1)^nh_n)$.
(e) Exponential Formula for $1/F$. When $a_0 = 1$, $1/F = \exp(-\log(F))$.
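
The recursion in part (a) translates almost line for line into code. The sketch below is a minimal illustration; the name `reciprocal` is hypothetical, a series is a list of coefficients, and rational arithmetic stands in for $K$.

```python
from fractions import Fraction


def reciprocal(a, N):
    """Coefficients u_0..u_N of 1/F for F = sum a[n] z^n with a[0] != 0.

    Uses u_0 = 1/a_0 and u_n = -(1/a_0) * sum_{k=1}^{n} a_k u_{n-k}.
    Coefficients of F that are not supplied are treated as zero.
    """
    a = [Fraction(x) for x in a] + [Fraction(0)] * (N + 1)
    u = [1 / a[0]]
    for n in range(1, N + 1):
        u.append(-sum(a[k] * u[n - k] for k in range(1, n + 1)) / a[0])
    return u
```

For instance, `reciprocal([1, -1, -1], 5)` returns the Fibonacci numbers 1, 1, 2, 3, 5, 8, the coefficients of $1/(1 - z - z^2)$.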


• Partial Fractions. Given polynomials $f, g \in \mathbb{C}[z]$ with $g(0) = 1$, let $g = \prod_{i=1}^{k} (1 - r_iz)^{n_i}$
where the $r_i$ are distinct nonzero complex numbers. There exist a unique polynomial
$h \in \mathbb{C}[z]$ and unique complex numbers $a_{ij}$ (for $1 \le i \le k$ and $1 \le j \le n_i$) with $f/g =
h + \sum_{i=1}^{k}\sum_{j=1}^{n_i} a_{ij}/(1 - r_iz)^j$. The coefficient of $z^n$ in the power series expansion of $f/g$
is
\[ (f/g)|_{z^n} = h|_{z^n} + \sum_{i=1}^{k}\sum_{j=1}^{n_i} a_{ij}\binom{n+j-1}{j-1}r_i^n. \]


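• Recursions with Constant Coefficients. Given constants $c_j, d_j \in \mathbb{C}$ and $g : \mathbb{Z}_{\ge k} \to \mathbb{C}$,
let $(a_n : n \ge 0)$ be defined recursively by $a_n = c_1a_{n-1} + c_2a_{n-2} + \cdots + c_ka_{n-k} + g(n)$ for $n \ge
k$, with initial conditions $a_i = d_i$ for $0 \le i < k$. The generating function $F = \sum_{n \ge 0} a_nz^n$ is
given by $F = G/P$, where $P = 1 - c_1z - c_2z^2 - \cdots - c_kz^k$, $G = \sum_{i=0}^{k-1} d_i'z^i + \sum_{n \ge k} g(n)z^n$,
and $d_i' = d_i - c_1d_{i-1} - c_2d_{i-2} - \cdots - c_id_0$.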


• Properties of Formal Composition. For fixed $G \in K[[z]]$ with $G(0) = 0$, the map
$R_G : K[[z]] \to K[[z]]$ given by $R_G(F) = F \circ G$ is the unique continuous $K$-algebra
homomorphism sending $z$ to $G$. The set $\mathcal{G} = \{F \in K[[z]] : F|_{z^0} = 0 \text{ and } F|_{z^1} \ne 0\}$ is
a group under formal composition. In particular, formal composition is associative when
defined, with identity element $z$.


• Properties of Formal Derivatives. For all $F, G \in K[[z]]$:
(a) The Sum Rule. $(F + G)' = F' + G'$.
(b) The Scalar Rule. For all $c \in K$, $(cF)' = c(F')$.
(c) The Product Rule. $(F \cdot G)' = (F') \cdot G + F \cdot (G')$.
(d) The Power Rule. For all $n \in \mathbb{Z}_{\ge 0}$, $(F^n)' = nF^{n-1} \cdot F'$.
(e) The Chain Rule. If $G(0) = 0$, then $\frac{d}{dz}[F(G(z))] = F'(G(z)) \cdot G'(z)$.


• Properties of Formal Exponentials. Let $F$, $G$, and $G_k$ (for $k \in \mathbb{Z}_{>0}$) be formal power
series with constant term zero.
(a) $\frac{d}{dz}\exp(z) = \exp(z)$, $\exp(0) = 1$, and $\frac{d}{dz}\exp(F(z)) = \exp(F(z))F'(z)$.
(b) $\exp(F) \ne 0$, and $\exp(-F) = 1/\exp(F)$.
(c) $\exp(F + G) = \exp(F)\exp(G)$.
(d) For all $N \in \mathbb{Z}_{>0}$, $\exp\left(\sum_{k=1}^{N} G_k\right) = \prod_{k=1}^{N} \exp(G_k)$.
(e) If $\sum_{k=1}^{\infty} G_k$ converges, then $\prod_{k=1}^{\infty} \exp(G_k)$ converges to $\exp\left(\sum_{k=1}^{\infty} G_k\right)$.


• Properties of Formal Logarithms. Let $F$, $G$, $H$, and $H_k$ (for $k \in \mathbb{Z}_{>0}$) be formal
power series with $F(0) = 0$ and $G(0) = H(0) = H_k(0) = 1$.
(a) $\frac{d}{dz}\log(1 + z) = (1 + z)^{-1}$, $\log(1) = 0$, $\frac{d}{dz}\log(1 + F(z)) = F'(z)/(1 + F(z))$, and
$\frac{d}{dz}\log(G(z)) = G'(z)/G(z)$.
(b) $\log(\exp(F)) = F$ and $\exp(\log(H)) = H$.
(c) $\log(GH) = \log(G) + \log(H)$.
(d) For all $N \in \mathbb{Z}_{>0}$, $\log\left(\prod_{k=1}^{N} H_k\right) = \sum_{k=1}^{N} \log(H_k)$.
(e) If $\prod_{k=1}^{\infty} H_k$ converges, then $\sum_{k=1}^{\infty} \log(H_k)$ converges to $\log\left(\prod_{k=1}^{\infty} H_k\right)$.


• The Exponential Formula. Let $C$ and $S$ be sets such that each object $x$ in $C$ or $S$
has a weight $\mathrm{wt}(x)$ and a size $\mathrm{sz}(x)$, where $\mathrm{sz}(c) > 0$ for all $c \in C$. Suppose that every
object $s \in S$ of size $n \ge 0$ can be constructed uniquely by the following process. First,
choose a set partition $P$ of $\{1,2,\ldots,n\}$. Next, for each block $B$ in the set partition $P$,
choose a connected structure $c(B)$ from $C$ such that $\mathrm{sz}(c(B)) = |B|$. Finally, assemble
these choices in a prescribed manner to produce $s$. Assume that whenever $s$ is built from
$P$ and $(c(B) : B \in P)$, $\mathrm{wt}(s) = \sum_{B \in P} \mathrm{wt}(c(B))$. Define generating functions $G_C =
\sum_{c \in C} t^{\mathrm{wt}(c)}z^{\mathrm{sz}(c)}/\mathrm{sz}(c)!$ and $G_S = \sum_{s \in S} t^{\mathrm{wt}(s)}z^{\mathrm{sz}(s)}/\mathrm{sz}(s)!$. Then $G_S = \exp(G_C)$.


• Terms and Ordered Trees. For every term $T$, there exist a unique integer $n \ge 0$
and unique terms $T_1,\ldots,T_n$ such that $T = nT_1T_2\cdots T_n$. A word $w_1\cdots w_s$ is a term iff
$w_1 + \cdots + w_i - i \ge 0$ for all $i < s$ and $w_1 + \cdots + w_s - s = -1$. No proper prefix of a term
is a term. Terms correspond bijectively to ordered trees.

• Lists of Terms and Ordered Forests. Every list of terms has the form $T_1\cdots T_n$ for
some unique integer $n \ge 0$ and unique terms $T_1,\ldots,T_n$. A word $w_1\cdots w_s$ is a list of $n$
terms iff $w_1 + \cdots + w_i - i > -n$ for all $i < s$ and $w_1 + \cdots + w_s - s = -n$. Lists of terms
correspond bijectively to ordered forests.


• The Cycle Lemma for Counting Lists of Terms. For a word $w = w_1\cdots w_s$ with
$w_1 + \cdots + w_s - s = -n$, there exist exactly $n$ cyclic shifts of $w$ that are lists of $n$ terms.
Consequently, the number of lists of $n$ terms using $k_i$ copies of $i$ (for $0 \le i \le t$) is
$\frac{n}{s}\binom{s}{k_0,k_1,\ldots,k_t}$, where $s = \sum_{i=0}^{t} k_i$ and $k_0 = n + \sum_{i=1}^{t} (i-1)k_i$.


• Compositional Inverses of Formal Series. Given $F = \sum_{n \ge 0} a_nz^n$ with $a_0 = 0 \ne a_1$,
$F$ has a compositional inverse $G = \sum_{n \ge 0} b_nz^n$ with coefficients given recursively by
$b_0 = 0$, $b_1 = 1/a_1$, and $b_n = (-1/a_1^n)\left(\sum_{m=1}^{n-1} b_mF^m\right)\big|_{z^n}$. Writing $F = z/R$ where $R =
\sum_{n \ge 0} r_nz^n \in K[[z]]$ and $r_0 \ne 0$, $G$ is also the generating function for the set of terms,
where the weight of a term $w_1\cdots w_n$ is $z^nr_{w_1}\cdots r_{w_n}$. For all $n \ge 1$, $b_n = \frac{1}{n}(R(z)^n)|_{z^{n-1}} =
\frac{1}{n!}\left[(d/dz)^{n-1}R(z)^n\right]\big|_{z^0}$.


Exercises 


11-1. Let f = z— 27 +324 and g = 1 — 2z — 3z*. Compute f +g, fg, and the degrees and 
orders of f, g, f +g, and fg. 

11-2. Let F = (1,0,1,0,1,0,...) and G = °° nz”. Compute F + G, FG, F(1+ 2), 
F(1 — 27), G(1 +z), F’, G’, and the orders of these formal power series. 


11-3. Prove that K[[z]] is a commutative ring, a vector space over K, and a K-algebra by 
verifying the axioms. 


11-4. Prove that $K[z]$ is a subring, subspace, and subalgebra of $K[[z]]$.
11-5. (a) Prove Theorem 11.2. (b) Deduce that $K[z]$ and $K[[z]]$ have no zero divisors.


11-6. When does equality hold in the formulas deg(P + Q) < max(deg(P), deg(Q)) and 
ord(F' + G) > min(ord(F’), ord(G)) from Theorem 11.2? 

11-7. Given nonzero $F_k \in K[[z]]$, prove $(F_k) \to 0$ in $K[[z]]$ iff $(\mathrm{ord}(F_k)) \to \infty$.

11-8. Fix $G \in K[[z]]$. Prove $(G^n) \to 0$ iff $G = 0$ or $\mathrm{ord}(G) > 0$.

11-9. Suppose $(F_m) \to F$ with $F_m, F \in K[[z]]$. Prove: for each $k \ge 0$, there exists $M \ge 0$
such that for all $m \ge M$ and all $i$ in the range $0 \le i \le k$, $F_m|_{z^i} = F|_{z^i}$.
11-10. Finish the proof of Theorem 11.5. 

11-11. Let $G = (b_n : n \ge 0) \in K[[z]]$ and define $F_m = b_mz^m \in K[[z]]$ for each $m \ge 0$.
Prove $\sum_{m=0}^{\infty} F_m = G$.

11-12. Finish the proof of Theorem 11.7. 


11-13. Given a sequence $(F_m)$ of formal series, let $(F_{m_i})$ be the subsequence of nonzero
terms of $(F_m)$. (a) Prove $\sum_{m=0}^{\infty} F_m$ converges to the sum $G$ iff $\sum_{i=0}^{\infty} F_{m_i}$ converges to the
sum $G$. (b) Prove $\prod_{m=0}^{\infty} (1 + F_m)$ converges to the product $G$ iff $\prod_{i=0}^{\infty} (1 + F_{m_i})$ converges
to the product $G$.

11-14. Density of $K[z]$ in $K[[z]]$. Show that for all $F \in K[[z]]$, there exists a sequence of
polynomials $P_n \in K[z]$ with $(P_n) \to F$.


11-15. For fixed $m, n \in \mathbb{Z}_{>0}$, evaluate $\sum \frac{m!}{k_0!\,k_1!\cdots k_n!}$, where the sum extends
over all lists $(k_0, k_1, \ldots, k_n) \in \mathbb{Z}_{\ge 0}^{n+1}$ such that $k_0 + k_1 + \cdots + k_n = m$ and $0k_0 + 1k_1 + \cdots + nk_n =
n$.


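11-16. Carefully justify the following calculation:
\[ \prod_{n=1}^{\infty} (1 - z^{2n-1})^{-1} = \prod_{i=1}^{\infty} (1 - z^{2i}) \prod_{j=1}^{\infty} (1 - z^{j})^{-1} = \prod_{k=1}^{\infty} (1 + z^{k}). \]
In particular, explain why all the infinite products appearing here converge.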

11-17. Carefully check that the infinite products at the end of §9.28 converge to the indi- 
cated formal series. 

11-18. Find a necessary and sufficient condition on series $F_k \in K[[z]]$ so that the infinite
product $\prod_{k=1}^{\infty} (1 + F_k)^{-1}$ exists.

11-19. Evaluate [J 9(1 + 2’). 

11-20. Ideal Structure of $K[z]$. Show that every nonzero ideal $I$ in the ring $K[z]$ has
the form $\langle P \rangle = \{PQ : Q \in K[z]\}$ for some monic polynomial $P \in K[z]$. [Hint: Let $P$ be
a monic polynomial in $I$ of least degree. Use polynomial division with remainder to show
$I = \langle P \rangle$.]

11-21. Ideal Structure of $K[[z]]$. Show that every nonzero ideal $I$ in the ring $K[[z]]$ has
the form $\langle z^m \rangle = \{z^mG : G \in K[[z]]\} = \{\sum_{n \ge m} a_nz^n : a_n \in K\}$ for some $m \in \mathbb{Z}_{\ge 0}$. Draw a
diagram of the poset of all ideals of $K[[z]]$ ordered by set inclusion.

11-22. Formal Laurent Series. A formal Laurent series is a sequence $F = (a_n : n \in \mathbb{Z})$,
denoted $F(z) = \sum_{n \in \mathbb{Z}} a_nz^n$, such that all $a_n$ are in $K$, and for some $d \in \mathbb{Z}$, $a_n = 0_K$
for all $n < d$. Let $K((z))$ be the set of all such formal Laurent series. Define a $K$-algebra
structure on $K((z))$ by analogy with $K[[z]]$, and verify the algebra axioms.

11-23. Prove that the Laurent series ring $K((z))$ is a field containing $K[[z]]$. Prove that
every $F \in K((z))$ has the form $F = GH^{-1}$ for some $G, H \in K[[z]]$; in fact, $H$ can be chosen
to be a power of $z$.

11-24. Compute the multiplicative inverse of }>°°_,?z” in K((z)). 

11-25. Convert the following expressions to formal Laurent series: (a) (z? + 3)/(z° — z?); 
(b) z/(z3 — 52? + 6z). 

11-26. Differentiation of Laurent Series. Define a version of the formal derivative operator
for the ring $K((z))$ of formal Laurent series. Extend the derivative rules (in particular,
the Quotient Rule) to this ring.

11-27. Carefully check the final equality in the proof of the Infinite Sum Rule for Weighted 
Sets. 

11-28. Generalize the Infinite Sum Rule 11.9 to the case where finitely many of the sets S;, 
have multiple objects of weight zero. 

11-29. Prove: for all n € Zs, and all k € Z,, k is invertible in the ring Z,, iff gcd(k, n) = 1. 
11-30. Check that the sequence (u,,) defined below (11.1) is the unique solution to the 
system (11.1). 

11-31. Solve the system (11.1) to find the first five terms in the multiplicative inverse of
each of the following series: (a) $e^z$; (b) $1 - 2z + z^3 + 3z^4$; (c) $1 + \log(1 + z)$.

11-32. Use the Formal Geometric Series Formula to find the first five terms in $(1 - z + z^3)^{-1}$.

11-33. Use (11.1) to find the first nine coefficients in the multiplicative inverse of the formal
series $\cos z = 1 - z^2/2! + z^4/4! - \cdots$.

11-34. Suppose $F(z) = 2 - 6z + 3z^2 + 5z^3 - z^4 + \cdots$. Say as much as you can about the
coefficients of the power series $1/F$.

11-35. Suppose $F \in K[[z]]$ is a formal series such that $\mathrm{ord}(F) = 0$ and there is $k \in \mathbb{Z}_{>0}$
such that $F|_{z^n} = 0$ whenever $n$ is not divisible by $k$. Show that $1/F$ also has this property.


11-36. Use geometric series to find the inverses of these formal series: (a) 1 —3z; (b) 1+ 2°; 
(c) a— bz where a,b £ 0. 

11-37. Prove by induction on $m$: for all $G \in K[[z]]$ and all $m \in \mathbb{Z}_{\ge 0}$, $(1 - G)\sum_{k=0}^{m} G^k =
1 - G^{m+1}$.

11-38. Derive (11.3) from (11.1). 

11-39. Check that $\phi^*$ defined in (11.4) is a $K$-algebra homomorphism.

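11-40. (a) Prove: for all $n \in \mathbb{Z}_{\ge 0}$,
\[ (-1)^nh_n = \sum_{m=0}^{n} (-1)^m \sum_{\substack{(i_1,i_2,\ldots,i_m) \in \mathbb{Z}_{>0}^m: \\ i_1+i_2+\cdots+i_m = n}} e_{i_1}e_{i_2}\cdots e_{i_m}. \]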

(b) Use part (a) to show that the symmetric function formula for the multiplicative inverse 
of a formal power series agrees with Formula (11.3). 


11-41. Find the partial fraction decomposition and the power series expansion of each ratio 
of polynomials. 

(a) $F(z) = (10 + 2z)/(1 - 2z - 8z^2)$.

(b) $F(z) = (1 - 7z)/(15z^2 - 8z + 1)$.

(c) $F(z) = (2z^3 - 4z^2 - z - 3)/(2z^2 - 4z + 2)$.

(d) $F(z) = (15z^6 + 30z^5 - 15z^4 - 35z^3 - 15z^2 - 12z - 8)/(15(z^4 + 2z^3 - 2z - 1))$.

11-42. Evaluation Homomorphisms for Polynomial Rings. Let $A$ be any $K$-algebra.
Prove that for all $a \in A$, there exists a unique $K$-algebra homomorphism $E_a : K[z] \to A$
such that $E_a(z) = a$. This map is called the evaluation homomorphism determined by $a$.


11-43. Prove Theorem 11.24(b). [Hint: One approach is to use the previous exercise 
and 11.24(a).] 

11-44. Even and Odd Formal Series. A series $F \in K[[z]]$ is even iff $F(-z) = F$; $F$ is
odd iff $F(-z) = -F$. (a) Show that $F$ is even iff $F|_{z^n} = 0$ for all odd $n$, and $F$ is odd iff
$F|_{z^n} = 0$ for all even $n$. (b) Give rules for determining the parity (even or odd) of $F + G$,
$FG$, and (when defined) $F^{-1}$, given the parity of $F$ and $G$.

11-45. (a) Show that the differentiation map $D : K[[z]] \to K[[z]]$, given by $D(F) = F'$
for $F \in K[[z]]$, is continuous and $K$-linear. (b) For fixed $H \in K[[z]]$, show that the map
$M_H : K[[z]] \to K[[z]]$, given by $M_H(F) = F \cdot H$ for $F \in K[[z]]$, is continuous and $K$-linear.

11-46. Prove: if $F = \sum_{k=0}^{\infty} F_k$ in $K[[z]]$, then $F' = \sum_{k=0}^{\infty} F_k'$.

11-47. Let $A$, $B$, $C$ be $K$-algebras, and let $p : A \to B$ and $q : B \to C$ be functions. (a) Prove
that if $p$ and $q$ are $K$-linear, then $q \circ p : A \to C$ is $K$-linear. (b) Prove that if $p$ and $q$ are
$K$-algebra homomorphisms, then $q \circ p$ is a $K$-algebra homomorphism.


11-48. Uniqueness of Evaluation Homomorphisms. Prove: Given $G \in K[[z]]$ with
$G(0) = 0$, if $h : K[[z]] \to K[[z]]$ is any continuous $K$-algebra homomorphism such that
$h(z) = G$, then $h(F) = F \circ G$ for all $F \in K[[z]]$ (so that $h = R_G$).

11-49. Complete the following outline to give a new proof of the Formal Product Rule
$(F \cdot G)' = F' \cdot G + F \cdot G'$ for $F, G \in K[[z]]$. (a) Show that the result holds when $F = z^i$ and
$G = z^j$, for all $i, j \in \mathbb{Z}_{\ge 0}$. (b) Deduce from (a) that the result holds for all $F, G \in K[z]$.
(c) Use a continuity argument to obtain the result for all $F, G \in K[[z]]$.


11-50. Verify that the maps $p$ and $q$ used in the proof of the Formal Chain Rule
(Theorem 11.25(e)) are $K$-linear and continuous.

11-51. Suppose $p, q : K[[z]] \to K[[z]]$ are continuous $K$-linear maps such that $p(z^n) = q(z^n)$
for all $n \in \mathbb{Z}_{\ge 0}$. Prove that $p = q$.

11-52. The Formal Quotient Rule. Suppose $F, G \in K[[z]]$ where $G(0) \ne 0$. Prove the
derivative rule $(F/G)' = (G \cdot F' - F \cdot G')/G^2$.

11-53. Formal Integrals. The formal integral or antiderivative of a formal series $F =
\sum_{n \ge 0} a_nz^n \in K[[z]]$ is the formal series
\[ \int F\,dz = \sum_{n \ge 0} \frac{a_n}{n+1}z^{n+1}, \]
which has constant term zero. Compute the formal integrals of the following formal power 
series: (a) 34+2z—7z?+12z°; (b) 2) n?2"; (c) 2 (n+1)!2"; (d) e*; (e) sin z; (f) cos z; 
(g) 1+z)~; (b) 3b. 
11-54. Prove the following facts about formal integrals.
(a) The Sum Rule: $\int (F + G)\,dz = \int F\,dz + \int G\,dz$ for all $F, G \in K[[z]]$.
(b) The Scalar Rule: $\int cF\,dz = c\int F\,dz$ for all $c \in K$ and $F \in K[[z]]$.
(c) The Linearity Rule: $\int \sum_{i=1}^{n} c_iH_i\,dz = \sum_{i=1}^{n} c_i\int H_i\,dz$ for $c_i \in K$ and $H_i \in K[[z]]$.
Can you formulate a similar statement for infinite sums?
(d) The Power Rule: $\int z^k\,dz = \frac{1}{k+1}z^{k+1}$ for all $k \ge 0$.
(e) General Antiderivatives: For all $F, G \in K[[z]]$, $G' = F$ iff there exists $c \in K$ with
$G = \int F\,dz + c$.
(f) The Formal Fundamental Theorems of Calculus: For all $F \in K[[z]]$,
$F = \frac{d}{dz}\int F\,dz$ and $\int F'\,dz = F - F(0)$.
(g) Continuity of Integration: For all $F_k, H \in K[[z]]$, if $(F_k) \to H$ then
$\left(\int F_k\,dz\right) \to \int H\,dz$.

11-55. Formulate and prove an Integration by Parts Rule and a Substitution Rule for formal
integrals.

11-56. (a) Prove that $(\sin z)^2 + (\cos z)^2 = 1$ in $K[[z]]$ by computing the coefficient of $z^n$ on
each side. (b) Prove that $(\sin z)^2 + (\cos z)^2 = 1$ in $K[[z]]$ by showing the derivative of the
left side is zero.


11-57. The Product Rule for Multiple Factors. Let $F_1, \ldots, F_k \in K[[z]]$. Prove that
\[ \frac{d}{dz}(F_1F_2\cdots F_k) = \sum_{j=1}^{k} F_1\cdots F_{j-1}\left(\frac{d}{dz}F_j\right)F_{j+1}\cdots F_k. \]
Does a version of this rule hold for infinite products?

11-58. Use evaluation homomorphisms to show that when F' € K[[z]] is a polynomial, we 
can define Fo G for all G € K|[z]], not just those G with G(0) = 0. Extend the results 
of §11.6 to this setting. 

11-59. Compute the first four nonzero terms in: (a) exp(sin x); (b) log(cos x). 

11-60. Suppose F,G € K|[z]] satisfy F(0) = 0 = G(0). Prove exp(F' + G) = exp(F’) exp(G) 
by computing the coefficient of z” on both sides. 

11-61. Prove Theorem 11.27 parts (d), (e), and (f). 

11-62. Prove Theorem 11.29 parts (d), (e), and (f). 

11-63. Let $X = \{F \in K[[z]] : F(0) = 0\}$ and $Y = \{G \in K[[z]] : G(0) = 1\}$. Prove that
$\exp : X \to Y$ is a bijection with inverse $\log : Y \to X$.


11-64. Formal Powers. Given $F \in K[[z]]$ with $F(0) = 1$ and $r \in K[[z]]$, define the formal
power $F^r = \exp(r\log(F))$. Prove the following facts. (a) $F^r$ is well-defined, and $F^r(0) = 1$.
(b) For $F, r, s \in K[[z]]$ with $F(0) = 1$, $F^{r+s} = F^rF^s$. (c) For $F, r \in K[[z]]$ with $F(0) = 1$,
$F^{-r} = 1/F^r$. (d) For $F, G, r \in K[[z]]$ with $F(0) = G(0) = 1$, $(FG)^r = F^rG^r$. (e) For
$F, r \in K[[z]]$ with $F(0) = 1$, $\log(F^r) = r\log(F)$. (f) $F^0 = 1$; if $n$ is a positive integer, $F^n$
(as defined here) equals the product of $n$ copies of $F$; and if $n$ is a negative integer, $F^n$ (as
defined here) equals the product of $|n|$ copies of $1/F$. (g) For $F, r \in K[[z]]$ with $F(0) = 1$,
$(F^r)' = rF^{r-1} \cdot F'$. (h) The operation sending $(F, r)$ to $F^r$ is continuous.
11-65. Prove a formal version of the Extended Binomial Theorem: for all $r \in K[[z]]$,
$(1 + z)^r = \sum_{n=0}^{\infty} \frac{r(r-1)\cdots(r-n+1)}{n!}z^n$, where $(1 + z)^r$ is defined in the previous exercise.
11-66. Formal nth Roots. Given $F \in K[[z]]$ with $F(0) = 1$ and a positive integer $n$, show
there exists a unique $G \in K[[z]]$ with $G(0) = 1$ and $G^n = F$, where $G^n$ is the product of $n$
copies of $G$.

11-67. State and prove a formal version of the Quadratic Formula for solving $AF^2 +
BF + C = 0$, where $A, B, C \in K[[z]]$ are known series and $F \in K[[z]]$ is unknown. What
hypotheses must you impose on $A$, $B$, $C$? Is the solution $F$ unique?


11-68. (a) Find the generating function for the set of set partitions of one of the sets 
{1,2,...,n} where every block has more than one element, weighted by number of blocks. 
(b) Find the number of such set partitions of {1,2,...,9}. (c) Find the number of such set 
partitions of {1,2,...,12} with four blocks. 


11-69. (a) Find the generating function for the set of set partitions of one of the sets 
{1,2,...,n} where all blocks have even size, weighted by number of blocks. (b) Find the 
number of such set partitions of {1,2,...,12}. (c) Find the number of such set partitions 
of {1,2,...,16} consisting of four blocks. 


11-70. (a) Find the generating function for the set of permutations of one of the sets
{1,2,...,n} where no cycle has length more than five, weighted by number of cycles.
(b) Find the number of such permutations of {1,2,...,10}. (c) Find the number of such 
permutations of {1,2,...,12} having four cycles. 


11-71. For $1 \le n \le 6$, find the number of connected simple graphs with n vertices.


11-72. (a) Find the generating function for connected simple digraphs with vertex set 
{1,2,...,n} for some n > 1 (see Definition 3.50). (b) Use (a) to find a summation formula 
counting connected simple digraphs with n vertices. (c) For $1 \le n \le 7$, find the number of
connected simple digraphs with n vertices. 


11-73. Carry out the computations showing how the equation $G_C = \log(G_S)$ leads to
formula (11.11) in Example 11.33.


11-74. Rewrite (11.11) as a sum over partitions of n. 


11-75. (a) Modify (11.11) to include a power of ¢ that keeps track of the number of edges in 
the connected graph. (b) How many connected simple graphs with vertex set {1, 2,3, 4,5, 6} 
have exactly seven edges? 

11-76. Let S be the set of simple graphs with vertex set {1,2,...,n} for some n > 0 
such that every component of the graph is an undirected cycle, weighted by number of 
components. (a) Find $G_S$. (b) Find the number of such graphs with 12 vertices and 5
components. 

11-77. A star is a tree with at most one vertex of degree greater than 1. Let S be the set 
of simple graphs with vertex set {1,2,...,n} for some n > 0 such that every component of 
the graph is a star, weighted by number of components. (a) Find $G_S$. (b) Find the number
of such graphs with 14 vertices and 4 components. 


11-78. Verify that $\frac{1}{n}\,(R(z)^n)\big|_{z^{n-1}} = \frac{1}{n!}\,(d/dz)^{n-1} R(z)^n\big|_{z=0}$ for all $n \in \mathbb{Z}_{>0}$ and $R \in K[[z]]$.

11-79. Use (11.12) to find the first several coefficients in the compositional inverses of each
of the following power series: (a) $\sin z$; (b) $\tan z$; (c) $z/(1 - z)$.

11-80. (a) List all terms of length at most 5. (b) Use (a) and Theorem 11.48 to find explicit 
formulas for the first five coefficients of the compositional inverse of z/R(z) as combinations 
of the coefficients of R(z). (c) Use (b) to find the first five terms in the compositional inverse 
of $z/(1 - 3z + 2z^2 + 5z^4)$.

11-81. Use Theorem 11.49 to compute the compositional inverse of the following formal 
series: (a) ze?*; (b) z— 27; (c) z/(1+ az); (d) z— 4244 42”. 

11-82. (a) Find a bijection from the set of terms of length n to the set of binary trees 
with $n - 1$ nodes. (b) Use (a) to formulate a version of Theorem 11.48 that expresses the
coefficients in the compositional inverse of z/R(z) as sums of weighted binary trees. 


11-83. (a) Find a bijection from the set of terms of length n to the set of Dyck paths ending 
at (n—1,n— 1). (b) Use (a) to formulate a version of Theorem 11.48 that expresses the 
coefficients in the compositional inverse of z/R(z) as sums of weighted Dyck paths. 


11-84. Let A(n,k) be the number of ways to assign n people to k committees in such a way 
that each person belongs to exactly one committee, and each committee has one member 


designated as chairman. Find a formula for )>7°_4 So; ACR) pe yr 


11-85. Formal Linear Ordinary Differential Equations (ODEs). Suppose $P, Q \in
K[[z]]$ are given formal series, and we are trying to find a formal series $F \in K[[z]]$ satisfying
the linear ODE $F' + PF = Q$ and initial condition $F(0) = c \in K$. Solve this ODE by
multiplying by the integrating factor $\exp(\int P\,dz)$ and using the Product Rule to simplify
the left side.


11-86. Formal ODEs with Constant Coefficients. Let $V$ be the set of all formal series
$F \in \mathbb{C}[[z]]$ satisfying the ODE

$$F^{(k)} + c_1 F^{(k-1)} + c_2 F^{(k-2)} + \cdots + c_k F = 0. \tag{11.13}$$

The characteristic polynomial for this ODE is $q(z) = z^k + c_1 z^{k-1} + c_2 z^{k-2} + \cdots + c_k \in \mathbb{C}[z]$.
Suppose $q(z)$ factors as $(z - r_1)^{k_1} \cdots (z - r_s)^{k_s}$ for certain $k_i > 0$ and distinct $r_i \in \mathbb{C}$.
(a) Show that the $k$ series $z^j \exp(r_i z)$ (for $1 \le i \le s$ and $0 \le j < k_i$) are in $V$. (b) Show
that $V$ is a complex vector space, and the $k$ series in (a) form a basis for $V$. (c) Describe a
procedure for expressing a given series $F \in V$ as a linear combination of the series
in the basis from part (a), given the initial conditions $F(0), F'(0), \ldots, F^{(k-1)}(0)$. (d) Let
$W$ be the set of formal series $G \in \mathbb{C}[[z]]$ satisfying the non-homogeneous ODE

$$G^{(k)} + c_1 G^{(k-1)} + c_2 G^{(k-2)} + \cdots + c_k G = H,$$

where $H \in \mathbb{C}[[z]]$ is a given series. If $G^*$ is one particular series in $W$, show that $W =
\{F + G^* : F \in V\}$.




Notes 


A detailed but very technical treatment of the algebraic theory of polynomials and formal 
power series is given in [18, Ch. IV]. Discussions of formal power series from a more combi- 
natorial perspective may be found in [121, Ch. 1] and [132]. Our treatment of compositional 
inversion closely follows the presentation in [102].


Additional Topics


This chapter covers a variety of topics illustrating different aspects of enumerative combi- 
natorics and probability. The treatment of each topic is essentially self-contained. 




12.1 Cyclic Shifting of Paths 


This section illustrates a technique for enumerating certain collections of lattice paths. The 
basic idea is to introduce an equivalence relation on paths by cyclically shifting the steps 
of a path. A similar idea was used in §11.11 to enumerate lists of terms. 


12.1. Theorem: Rational-Slope Dyck Paths. Let $r$ and $s$ be positive integers such
that $\gcd(r, s) = 1$. The number of lattice paths from $(0,0)$ to $(r,s)$ that never go below the
diagonal line $sx = ry$ is
$$\frac{1}{r+s}\binom{r+s}{r,s}.$$


We call the paths counted by this theorem Dyck paths of slope s/r. 


Proof. Step 1. Let $X = \mathcal{R}(\mathrm{E}^r \mathrm{N}^s)$, which is the set of all rearrangements of $r$ copies of E and
$s$ copies of N. Thinking of E as an east step and N as a north step, we see that $X$ can be
identified with the set of all lattice paths from $(0,0)$ to $(r,s)$. Given $v = v_1 v_2 \cdots v_{r+s} \in X$,
we define an associated label vector $L(v) = (m_0, m_1, \ldots, m_{r+s})$ as follows. We set $m_0 = 0$.
Then we recursively calculate $m_i = m_{i-1} + r$ if $v_i = \mathrm{N}$, and $m_i = m_{i-1} - s$ if $v_i = \mathrm{E}$. For
example, if $r = 5$, $s = 3$, and $v = \mathrm{NEEENENE}$, then $L(v) = (0, 5, 2, -1, -4, 1, -2, 3, 0)$. We
can also describe this construction in terms of the lattice path encoded by $v$. If we label each
lattice point $(x,y)$ on this path by the integer $ry - sx$, then $L(v)$ is the sequence of labels
encountered as we traverse the path from $(0,0)$ to $(r,s)$. This construction is illustrated
by the lattice paths in Figure 12.1. Note that $v$ is recoverable from $L(v)$, since $v_i = \mathrm{N}$ iff
$m_i - m_{i-1} = r$, and $v_i = \mathrm{E}$ iff $m_i - m_{i-1} = -s$.
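
The label vector of Step 1 is easy to compute mechanically. The following Python sketch encodes a path as a string in the letters N and E; the function names `label_vector` and `word_from_labels` are illustrative choices of ours, not notation from the text. It reproduces the example with r = 5 and s = 3 and checks that the word can be recovered from its labels.

```python
def label_vector(word, r, s):
    """Return L(v) for a word v in the letters 'N' and 'E'.

    Each N adds r to the running label and each E subtracts s,
    which matches labeling the lattice point (x, y) by r*y - s*x.
    """
    labels = [0]
    for step in word:
        if step == "N":
            labels.append(labels[-1] + r)
        elif step == "E":
            labels.append(labels[-1] - s)
        else:
            raise ValueError(f"unexpected step {step!r}")
    return labels


def word_from_labels(labels, r, s):
    """Recover v from L(v) by inspecting consecutive differences."""
    return "".join("N" if b - a == r else "E" for a, b in zip(labels, labels[1:]))


labels = label_vector("NEEENENE", 5, 3)
print(labels)                          # [0, 5, 2, -1, -4, 1, -2, 3, 0]
print(word_from_labels(labels, 5, 3))  # NEEENENE
```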

Step 2. We prove that for all $v \in X$, if $L(v) = (m_0, m_1, \ldots, m_{r+s})$ then $m_0, m_1, \ldots,
m_{r+s-1}$ are distinct, whereas $m_{r+s} = 0 = m_0$. To see this, suppose there exist $x, y, a, b$ with
$0 \le a \le r$ and $0 \le b \le s$, such that $(x,y)$ and $(x+a, y+b)$ are two distinct points on the lattice
path for $v$ that have the same label. This means that $ry - sx = r(y+b) - s(x+a)$, which
simplifies to $rb = sa$. Since the two points are distinct, $a$ and $b$ are not both zero, so the
number $rb = sa$ is a positive common multiple of $r$ and $s$. Since
$\gcd(r,s) = 1$, we have $\operatorname{lcm}(r,s) = rs$, so that $rb \ge rs$ and $sa \ge rs$. Thus $b \ge s$ and $a \ge r$,
forcing $b = s$ and $a = r$. But then $(x,y)$ must be $(0,0)$ and $(x+a, y+b)$ must be $(r,s)$. So
the only two points on the path with equal labels are $(0,0)$ and $(r,s)$, which correspond to
$m_0$ and $m_{r+s}$.

Step 3. Introduce an equivalence relation $\sim$ on $X$ (see Definition 2.50) by setting $v \sim w$
iff $v$ is a cyclic shift of $w$. More precisely, defining $C(w_1 w_2 \cdots w_{r+s}) = w_2 w_3 \cdots w_{r+s} w_1$, we
have $v \sim w$ iff $v = C^i(w)$ for some integer $i$ (which can be chosen in the range $0 \le i < r+s$).
For each $v \in X$, let $[v] = \{w \in X : w \sim v\}$ be the equivalence class of $v$ relative to this
equivalence relation. Figure 12.1 shows the paths in the class [NEEENENE].

FIGURE 12.1
Cyclic shifts of a lattice path.

Step 4. We show that for all $v \in X$, the equivalence class $[v]$ has size $r + s$, which
means that all $r + s$ cyclic shifts of $v$ are distinct. Suppose $v = v_1 v_2 \cdots v_{r+s}$ has $L(v) =
(m_0, m_1, \ldots, m_{r+s})$. By definition of $L$, for each $i$ with $0 \le i < r+s$, the label vector of the
cyclic shift $C^i(v) = v_{i+1} \cdots v_{r+s} v_1 \cdots v_i$ is

$$L(C^i(v)) = (0,\ m_{i+1} - m_i,\ m_{i+2} - m_i,\ \ldots,\ m_{r+s} - m_i,\ m_1 - m_i,\ \ldots,\ m_i - m_i)$$

(see Figure 12.1 for examples). The set of integers appearing in the label vector $L(C^i(v))$ is
therefore obtained from the set of integers in $L(v)$ by subtracting $m_i$ from each integer in
the latter set. In particular, if $\mu$ is the smallest integer in $L(v)$, then the smallest integer in
$L(C^i(v))$ is $\mu - m_i$. Since the numbers $m_0, m_1, \ldots, m_{r+s-1}$ are distinct (by Step 2), we see
that the minimum elements in the sequences $L(C^i(v))$ are distinct, as $i$ ranges from 0 to
$r+s-1$. This implies that the sequences $L(C^i(v))$, and hence the words $C^i(v)$, are pairwise
distinct.

Step 5. We show that, for all $v \in X$, there exists a unique word $w \in [v]$ such that $w$
encodes a Dyck path of slope $s/r$. By the way we defined the labels, $w$ is a Dyck path of
slope $s/r$ iff $L(w)$ has no negative entries. We know from Step 4 that the set of labels in
$L(C^i(v))$ is obtained from the set of labels in $L(v)$ by subtracting $m_i$ from each label in
the latter set. By Step 2, there is a unique $i$ in the range $0 \le i < r+s$ such that $m_i = \mu$,
the minimum value in $L(v)$. For this choice of $i$, we have $m_j \ge \mu = m_i$ for every $j$, so that
$m_j - m_i \ge 0$ and $L(C^i(v))$ has no negative labels. For any other choice of $i$, $m_i > \mu$ by Step
2, so that $L(C^i(v))$ contains the negative label $\mu - m_i$.
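
Steps 4 and 5 translate directly into a procedure for locating the Dyck path inside a cyclic class: find the position of the minimum label and rotate the word to start there. The Python sketch below illustrates this under the same N/E string encoding; the names `label_vector` and `dyck_rotation` are hypothetical helpers of ours, and the code assumes gcd(r, s) = 1 so that the minimum label is attained only once.

```python
def label_vector(word, r, s):
    """Labels m_0, ..., m_{r+s}: each N adds r, each E subtracts s."""
    labels = [0]
    for step in word:
        labels.append(labels[-1] + (r if step == "N" else -s))
    return labels


def dyck_rotation(word, r, s):
    """Return (i, C^i(word)), where C^i(word) has no negative labels.

    Assumes gcd(r, s) = 1, so m_0, ..., m_{r+s-1} are distinct and the
    minimum label mu occurs at exactly one index i.
    """
    labels = label_vector(word, r, s)[:-1]  # drop m_{r+s}, which repeats m_0
    i = labels.index(min(labels))           # the unique i with m_i = mu
    return i, word[i:] + word[:i]           # C^i(v) = v_{i+1} ... v_{r+s} v_1 ... v_i


i, w = dyck_rotation("NEEENENE", 5, 3)
print(i, w)                    # 4 NENENEEE
print(label_vector(w, 5, 3))   # [0, 5, 2, 7, 4, 9, 6, 3, 0]
```

Among the eight cyclic shifts of NEEENENE, only this one has a label vector with no negative entries, in agreement with Step 5.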

Step 6. Suppose $\sim$ has $n$ equivalence classes in $X$. By Step 5, $n$ is also the number of
Dyck paths of slope $s/r$. By Step 4, each equivalence class has size $r + s$. Since $X$ is the
disjoint union of its equivalence classes, we see from the Sum Rule and the Anagram Rule
that
$$\binom{r+s}{r,s} = |X| = n(r+s).$$
Dividing by $r + s$ gives the formula stated in the theorem. □

FIGURE 12.2
Comparing Dyck paths to Dyck paths of slope n/(n + 1).


We can use the previous theorem to enumerate Dyck paths and certain m-ballot num- 
bers, as follows. 


12.2. Corollary. For all $n \in \mathbb{Z}_{\ge 1}$, the number of Dyck paths ending at $(n,n)$ is
$$\frac{1}{2n+1}\binom{2n+1}{n+1,\,n}.$$
For all $m, n \in \mathbb{Z}_{\ge 1}$, the number of lattice paths from $(0,0)$ to $(mn,n)$ that stay weakly
above $x = my$ is
$$\frac{1}{(m+1)n+1}\binom{(m+1)n+1}{mn+1,\,n}.$$

Proof. Let $X$ be the set of Dyck paths ending at $(n,n)$, and let $X'$ be the set of Dyck
paths of slope $n/(n+1)$ ending at $(n+1,n)$. Since $\gcd(n+1,n) = 1$, we know that
$|X'| = \frac{1}{2n+1}\binom{2n+1}{n+1,\,n}$. On the other hand, tilting the line $y = x$ to the line $(n+1)y = nx$ does
not introduce any new lattice points in the region visited by paths in $X$ and $X'$, except
for $(n+1,n)$, as shown in Figure 12.2. It follows that appending a final east step gives a
bijection from $X$ onto $X'$, proving the first formula in the corollary. The second formula is
proved in the same way: appending a final east step gives a bijection from the set of lattice
paths described in the corollary to the set of Dyck paths of slope $n/(mn+1)$. □
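
For small parameters, Theorem 12.1 and Corollary 12.2 can be checked by exhaustive enumeration. The Python sketch below is only a sanity check under our own naming (`count_rational_dyck` and `formula` are hypothetical helpers): it lists every word with r east steps and s north steps, keeps those whose labels ry - sx never become negative, and compares the count with the stated formula.

```python
from itertools import combinations
from math import comb, gcd


def count_rational_dyck(r, s):
    """Count paths from (0,0) to (r,s) that never go below the line s*x = r*y."""
    total = 0
    for north_positions in combinations(range(r + s), s):
        north = set(north_positions)
        label = 0
        ok = True
        for i in range(r + s):
            label += r if i in north else -s  # label of the point reached is r*y - s*x
            if label < 0:
                ok = False
                break
        total += ok
    return total


def formula(r, s):
    """The count from Theorem 12.1; the multinomial (r+s; r,s) equals comb(r+s, r)."""
    return comb(r + s, r) // (r + s)


print(count_rational_dyck(5, 3), formula(5, 3))  # 7 7
for r in range(1, 6):
    for s in range(1, 6):
        if gcd(r, s) == 1:
            assert count_rational_dyck(r, s) == formula(r, s)

# Corollary 12.2 with n = 3: slope 3/4 paths are counted by the Catalan number C_3.
print(count_rational_dyck(4, 3), comb(6, 3) // 4)  # 5 5
```

The enumeration grows like a binomial coefficient, so it is meant only for small r and s.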




12.2 The Chung—Feller Theorem 


In §1.14, we defined Dyck paths and proved that the number of Dyck paths of order $n$ is
the Catalan number $C_n = \frac{1}{n+1}\binom{2n}{n,n}$. This section discusses a remarkable generalization of
this result called the Chung–Feller Theorem.

12.3. Definition: Flawed Paths. Suppose $\pi$ is a lattice path from $(0,0)$ to $(n,n)$, viewed
as a sequence of lattice points: $\pi = ((x_0, y_0), \ldots, (x_{2n}, y_{2n}))$. For $1 \le j \le n$, we say that $\pi$
has a flaw in row $j$ iff there exists a point $(x_i, y_i)$ visited by $\pi$ such that $y_i = j - 1$, $y_i < x_i$,
and $(x_{i+1}, y_{i+1}) = (x_i, y_i + 1)$. This means that the $j$th north step of $\pi$ occurs in the region
southeast of the diagonal line $y = x$. For $1 \le j \le n$, define

$$X_j(\pi) = \begin{cases} 1 & \text{if $\pi$ has a flaw in row $j$;} \\ 0 & \text{otherwise.} \end{cases}$$

Also define the number of flaws of $\pi$ by setting $\operatorname{flaw}(\pi) = X_1(\pi) + X_2(\pi) + \cdots + X_n(\pi)$.


For example, the paths shown in Figure 12.3 have zero and six flaws, respectively. The
paths shown in Figure 12.4 have five and zero flaws, respectively. Observe that $\pi$ is a Dyck
path iff $\operatorname{flaw}(\pi) = 0$.
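
Definition 12.3 can be turned into a short computation. In the Python sketch below, a path is encoded as a string of N (north) and E (east) steps; this encoding and the names `flaw_rows` and `flaw` are assumptions made for illustration. A north step starting at a point (x, y) with y < x is recorded as a flaw in row y + 1.

```python
def flaw_rows(path):
    """Return the set of rows j in which the path has a flaw."""
    x = y = 0
    rows = set()
    for step in path:
        if step == "N":
            if y < x:            # this north step starts strictly southeast of y = x
                rows.add(y + 1)  # it is the (y+1)th north step, so the flaw is in row y+1
            y += 1
        else:
            x += 1
    return rows


def flaw(path):
    """Number of flaws, i.e. X_1 + X_2 + ... + X_n."""
    return len(flaw_rows(path))


print(flaw("NENE"))       # 0  (a Dyck path)
print(flaw_rows("NEEN"))  # {2}
print(flaw("ENEN"))       # 2
```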


12.4. The Chung–Feller Theorem. Fix $n \in \mathbb{Z}_{>0}$, and let $A$ be the set of lattice paths
from $(0,0)$ to $(n,n)$. For $0 \le k \le n$, let $A_k = \{\pi \in A : \operatorname{flaw}(\pi) = k\}$. Then $|A_k| = |A_0|$ for
all $k$. In particular, for $0 \le k \le n$,

$$|A_k| = \frac{|A|}{n+1} = \frac{1}{n+1}\binom{2n}{n,n} = C_n.$$


Proof. Fix $k$ with $0 < k \le n$. To prove that $|A_0| = |A_k|$, we define a bijection $\phi_k : A_0 \to A_k$.
See Figure 12.3 for an example where $n = 10$ and $k = 6$. Given a Dyck path $\pi \in A_0$, we
begin by drawing the line $y = k$ superimposed on the Dyck path. There is a unique point
$(x_i, y_i)$ on $\pi$ such that $y_i = k$ and $\pi$ arrives at $(x_i, y_i)$ by taking a vertical step. Call this
step the special vertical step. Let $(a_1, a_2, \ldots) \in \{H, V\}^{2n - x_i - y_i}$ be the ordered sequence of
steps of $\pi$ reading northeast from $(x_i, y_i)$, where H means horizontal step and V means
vertical step. Let $(b_0, b_1, b_2, \ldots) \in \{H, V\}^{x_i + y_i}$ be the ordered sequence of steps of $\pi$ reading
southwest from $(x_i, y_i)$, where $b_0 = V$ is the special vertical step. For the Dyck path shown
on the left in Figure 12.3, we have

$$a_1 a_2 \cdots a_{11} = \mathrm{VHVHHHVHVHH}; \qquad b_0 b_1 \cdots b_8 = \mathrm{VVHVHVVHV}.$$

We compute the lattice path $\phi_k(\pi)$, viewed as a sequence of steps $c_1 c_2 \cdots c_{2n} \in \{V, H\}^{2n}$, as
follows. Let $c_1 = a_1$, $c_2 = a_2$, etc., until we reach a horizontal step $c_\ell = a_\ell$ that ends strictly
below the diagonal $y = x$. Then set $c_{\ell+1} = b_1$, $c_{\ell+2} = b_2$, etc., until we reach a vertical step
$c_{\ell+m} = b_m$ that ends on the line $y = x$. Then set $c_{\ell+m+1} = a_{\ell+1}$, $c_{\ell+m+2} = a_{\ell+2}$, etc.,
until we take a horizontal step that ends strictly below $y = x$. Then switch back to using
the steps $b_{m+1}, b_{m+2}$, etc., until we return to $y = x$. Continue in this way until all steps are
used. By convention, the special vertical step $b_0 = V$ is the last step from the b-sequence to
be consumed.
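
The interleaving rule just described can be sketched in Python as follows. This is our reading of the construction, not code from the text: paths are strings of V and H, the names `phi` and `flaws` are hypothetical, and we assume that when one of the two step sequences runs out, the remaining steps of the other sequence are simply copied. The step string used in the demonstration is the one determined by the sequences $a_1 \cdots a_{11}$ and $b_0 \cdots b_8$ displayed above.

```python
def flaws(path):
    """Count vertical steps that start strictly below the diagonal y = x."""
    x = y = count = 0
    for step in path:
        if step == "V":
            count += y < x
            y += 1
        else:
            x += 1
    return count


def phi(path, k):
    """Sketch of phi_k applied to a Dyck path given as a string of 'V' and 'H'."""
    height = 0
    for special, step in enumerate(path):   # locate the kth vertical step
        if step == "V":
            height += 1
            if height == k:
                break
    else:
        raise ValueError("k must satisfy 1 <= k <= n")
    a = path[special + 1:]            # a_1, a_2, ...: steps after the special point
    b = path[:special][::-1] + "V"    # b_1, b_2, ..., with b_0 consumed last

    out = []
    x = y = ia = ib = 0
    use_a = True
    while ia < len(a) or ib < len(b):
        # Assumption: if the preferred sequence is exhausted, use the other one.
        take_a = (use_a and ia < len(a)) or ib == len(b)
        if take_a:
            step = a[ia]
            ia += 1
        else:
            step = b[ib]
            ib += 1
        if step == "V":
            y += 1
        else:
            x += 1
        out.append(step)
        if take_a and step == "H" and x > y:
            use_a = False   # ended strictly below y = x: switch to the b-sequence
        elif not take_a and step == "V" and x == y:
            use_a = True    # returned to y = x: switch back to the a-sequence
    return "".join(out)


pi = "VHVVHVHVV" + "VHVHHHVHVHH"   # steps b_8 ... b_1 b_0 followed by a_1 ... a_11
image = phi(pi, 6)
print(image)          # VHVHHVHHVHVVVHVHHHVV
print(flaws(image))   # 6
```

For n = 3, applying `phi` with k = 1, 2, 3 to the five Dyck paths produces five distinct paths with exactly k flaws in each case, which is consistent with the bijection claimed in the proof.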

For example, given the path $\pi$ in Figure 12.3, we have labeled the steps of the path A
through T for reference purposes. The special vertical step is step I. We begin by transferring
steps J, K, L, M, N to the image path (starting at the origin). Step N goes below the diagonal,
so we jump to the section of $\pi$ prior to the special vertical step and work southwest. After
taking only one step (the vertical step labeled H), we have returned to the diagonal. Now
we jump back to our previous location in the top part of $\pi$ and take step O. This again
takes us below the diagonal, so we jump back to the bottom part of $\pi$ and transfer steps G,
F, E, D, C. Now we return to the top part and transfer steps P, Q, R, S, T. Then we return
to the bottom part of $\pi$ and transfer steps B, A, and finally the special vertical step I.

This construction has the following crucial property. Vertical steps above the line $y = k$
in $\pi$ get transferred to vertical steps above the line $y = x$ in $\phi_k(\pi)$, while vertical steps
below the line $y = k$ in $\pi$ get transferred to vertical steps below the line $y = x$ in $\phi_k(\pi)$.
Thus, $\phi_k(\pi)$ has exactly $k$ flaws, so that $\phi_k(\pi)$ is an element of $A_k$.

FIGURE 12.3
Mapping Dyck paths to flawed paths.

Moreover, consider the coordinates of the special point $(x_i, y_i)$. By definition, $y_i = k =
\operatorname{flaw}(\phi_k(\pi))$. On the other hand, we claim that $y_i - x_i$ equals the number of horizontal steps
in $\phi_k(\pi)$ that start on $y = x$ and end to the right of $y = x$. Each such horizontal step came
from a step northeast of $(x_i, y_i)$ in $\pi$ that brings the path $\pi$ closer to the main diagonal
$y = x$. For instance, these steps are N, O, and T in Figure 12.3. The definition of $\phi_k$ shows
that the steps in question (in $\pi$) are the earliest east steps after $(x_i, y_i)$ that arrive on the
lines $y = x + d$ for $d = y_i - x_i - 1, \ldots, 2, 1, 0$. The number of such steps is therefore $y_i - x_i$,
as claimed.

The observations in the last paragraph allow us to compute the inverse map $\phi'_k : A_k \to
A_0$. Suppose $\pi^* \in A_k$ is a path with $k$ flaws. We can recover $(x_i, y_i)$ since $y_i = k$
and $y_i - x_i$ is the number of east steps of $\pi^*$ departing from $y = x$. Next, we transfer the
steps of $\pi^*$ to the top and bottom portions of $\phi'_k(\pi^*)$ by reversing the process described
earlier. Figure 12.4 gives an example where $n = 10$ and $k = 5$. First we find the special
point $(x_i, y_i) = (2,5)$. We start by transferring the initial steps A, B, C of $\pi^*$ to the part
of the image path starting at $(2,5)$ and moving northeast. Since C goes below the diagonal
in $\pi^*$, we now switch to the bottom part of the image path. The special vertical step must
be skipped, so we work southwest from $(2,4)$. We transfer steps D, E, F, G, H. Since H
returns to $y = x$ in $\pi^*$, we then switch back to the top part of the image path. We only
get to transfer one step (step I) before returning to the bottom part of the image path. We
transfer step J, then move back to the top part and transfer steps K through S. Finally, step
T is transferred to become the special vertical step from $(2,4)$ to $(2,5)$. It can be checked
that $\phi'_k$ is the two-sided inverse of $\phi_k$, so $\phi_k : A_0 \to A_k$ is a bijection.

We now know that $|A_k| = |A_0|$ for all $k$ between 0 and $n$. Note that $A$ is the disjoint
union of the $n + 1$ sets $A_0, A_1, \ldots, A_n$, all of which have cardinality $|A_0|$. By the Sum Rule,

$$|A| = |A_0| + |A_1| + \cdots + |A_n| = (n+1)|A_0|.$$

For each $k$ between 0 and $n$, the Anagram Rule gives

$$|A_k| = |A_0| = \frac{|A|}{n+1} = \frac{1}{n+1}\binom{2n}{n,n} = C_n.$$
□

FIGURE 12.4
Mapping flawed paths to Dyck paths.


In probabilistic language, the Chung—Feller Theorem can be stated as follows. 


12.5. Corollary. Suppose we pick a random lattice path $\pi$ from the origin to $(n,n)$. The
number of flaws in this path is uniformly distributed on $\{0, 1, 2, \ldots, n\}$. In other words,
$$P(\operatorname{flaw}(\pi) = k) = \frac{1}{n+1} \quad \text{for all $k$ between 0 and $n$.}$$

Proof. Given $k$ with $0 \le k \le n$, we compute

$$P(\operatorname{flaw}(\pi) = k) = \frac{|A_k|}{|A|} = \frac{\frac{1}{n+1}\binom{2n}{n,n}}{\binom{2n}{n,n}} = \frac{1}{n+1}.$$
□
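
The uniform distribution asserted in Corollary 12.5 can be observed directly for small n by listing all lattice paths. The Python sketch below is a brute-force illustration with hypothetical helper names (`flaw_rows`, `all_paths`, `flaw_distribution`, `probability`) and the N/E string encoding of paths. Besides tallying flaw counts, it computes the probability that a given set of rows is flawed, which lets one inspect the indicator variables $X_j$ of Definition 12.3 individually and jointly.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def flaw_rows(path):
    """Rows j in which the path (a string of 'N' and 'E') has a flaw."""
    x = y = 0
    rows = set()
    for step in path:
        if step == "N":
            if y < x:
                rows.add(y + 1)
            y += 1
        else:
            x += 1
    return rows


def all_paths(n):
    """Generate every lattice path from (0,0) to (n,n)."""
    for north in combinations(range(2 * n), n):
        chosen = set(north)
        yield "".join("N" if i in chosen else "E" for i in range(2 * n))


def flaw_distribution(n):
    """Map each flaw count k to the number of paths with exactly k flaws."""
    return Counter(len(flaw_rows(p)) for p in all_paths(n))


def probability(n, rows):
    """P(X_j = 1 for every j in rows) for a uniformly random path to (n, n)."""
    paths = list(all_paths(n))
    hits = sum(1 for p in paths if set(rows) <= flaw_rows(p))
    return Fraction(hits, len(paths))


print(sorted(flaw_distribution(4).items()))
# [(0, 14), (1, 14), (2, 14), (3, 14), (4, 14)]
print(probability(4, [1]), probability(4, [4]))  # 1/2 1/2
print(probability(2, [1, 2]))                    # 1/3
```

For n = 2, the joint probability 1/3 differs from the product (1/2)(1/2) = 1/4 of the two marginal probabilities.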


12.6. Remark. The Chung—Feller Theorem is significant in probability theory for the fol- 
lowing reason. One of the major theorems of probability is the Central Limit Theorem. 
Roughly speaking, this theorem says that the sum of a large number of independent, identi- 
cally distributed random variables (appropriately normalized) converges to a normal distri- 
bution. The normal distribution is described by the bell curve that appears ubiquitously in 
probability and statistics. One often deals with situations involving random variables that 
are not identically distributed and are not independent of one another. One might hope 
that a generalization of the Central Limit Theorem would still hold in such situations. 

Chung and Feller used the example of flawed lattice paths to show that such a
generalization is not always possible. Fix $n > 0$, and let the sample space $S$ consist of all lattice
paths from the origin to $(n,n)$. Given a lattice path $\pi \in S$, recall that

$$\operatorname{flaw}(\pi) = X_1(\pi) + X_2(\pi) + \cdots + X_n(\pi),$$

where $X_j(\pi)$ is 1 if $\pi$ has a flaw in row $j$ and 0 otherwise. The random variables
$X_1, X_2, \ldots, X_n$ are identically distributed; in fact, $P(X_j = 0) = 1/2 = P(X_j = 1)$ for
all $j$ (see Exercise 12-7). But we have seen that the sum of these random variables, namely
the flaw statistic $X_1 + X_2 + \cdots + X_n$, is uniformly distributed on $\{0, 1, 2, \ldots, n\}$ for every
$n$. A uniform distribution is about as far as we can get from a normal distribution! The
trouble is that the random variables $X_1, \ldots, X_n$ are not independent.


12.3 Rook-Equivalence of Ferrers Boards


This section continues the investigation of rook theory begun in §2.14. We first define the 
concepts of Ferrers boards and rook polynomials. Then we derive a characterization of when 
two Ferrers boards have the same rook polynomial. 


12.7. Definition: Ferrers Boards and Rook Polynomials. Let $\mu = (\mu_1 \ge \mu_2 \ge \cdots \ge
\mu_s > 0)$ be an integer partition of $n$. The Ferrers board $F_\mu$ is a diagram consisting of $s$
left-justified rows of squares with $\mu_i$ squares in row $i$. A non-attacking placement of $k$ rooks
on $F_\mu$ is a subset of $k$ squares in $F_\mu$ such that no two squares lie in the same row or column.
For all $k \in \mathbb{Z}_{\ge 0}$, let $r_k(\mu)$ be the number of non-attacking placements of $k$ rooks on $F_\mu$. The
rook polynomial of $\mu$ is

$$R_\mu(x) = \sum_{k=0}^{n} r_k(\mu)\, x^k.$$


12.8. Example. If $\mu = (4,1,1,1)$, then $R_\mu(x) = 9x^2 + 7x + 1$. To see this, note that there is
one empty subset of $F_\mu$, which is a non-attacking placement of zero rooks. We can place one
rook on any of the seven squares in $F_\mu$, so the coefficient of $x^1$ in $R_\mu(x)$ is 7. To place two
non-attacking rooks, we place one rook in the first column but not in the first row (three
ways), and we place the second rook in the first row but not in the first column (three
ways). The Product Rule gives 9 as the coefficient of $x^2$ in $R_\mu(x)$. It is impossible to place
three or more non-attacking rooks on $F_\mu$, so all higher coefficients in $R_\mu(x)$ are zero.
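
The coefficients in this example can be confirmed by brute force. The Python sketch below enumerates subsets of squares of a Ferrers board and keeps those with no two squares in a common row or column; the function name `rook_numbers` and the representation of a partition as a tuple of row lengths are our own assumptions, and the method is practical only for small boards.

```python
from itertools import combinations


def rook_numbers(mu):
    """Return [r_0, r_1, ...] for the Ferrers board with row lengths mu."""
    squares = [(i, j) for i, length in enumerate(mu) for j in range(length)]
    max_rooks = min(len(mu), max(mu, default=0))  # at most one rook per row and column
    counts = []
    for k in range(max_rooks + 1):
        total = 0
        for placement in combinations(squares, k):
            rows = {i for i, _ in placement}
            cols = {j for _, j in placement}
            if len(rows) == k and len(cols) == k:  # all rows and columns distinct
                total += 1
        counts.append(total)
    return counts


print(rook_numbers((4, 1, 1, 1)))  # [1, 7, 9, 0, 0]
print(rook_numbers((2, 2)))        # [1, 4, 2]
print(rook_numbers((3, 1)))        # [1, 4, 2]
```

The list entry at index k is the coefficient of $x^k$, so the first line corresponds to $9x^2 + 7x + 1$.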


As seen in the previous example, the constant term in any rook polynomial is 1, whereas
the linear coefficient of a rook polynomial is the number $|\mu|$ of squares on the board $F_\mu$.
Furthermore, $R_\mu(x)$ has degree at most $\min(\mu_1, \ell(\mu))$, since all rooks must be placed in
distinct rows and columns of the board.

It is possible for two different partitions to have the same rook polynomial. For example,
it can be checked that

$$R_{(2,2)}(x) = 2x^2 + 4x + 1 = R_{(3,1)}(x) = R_{(2,1,1)}(x).$$

More generally, $R_\mu(x) = R_{\mu'}(x)$ for any partition $\mu$, where $\mu'$ denotes the conjugate partition.


12.9. Definition: Rook-Equivalence. We say that two integer partitions $\mu$ and $\nu$ are
rook-equivalent iff they have the same rook polynomial, which means $r_k(\mu) = r_k(\nu)$ for all
integers $k \ge 0$.


A necessary condition for $\mu$ and $\nu$ to be rook-equivalent is that $|\mu| = |\nu|$. The next
theorem gives an easily tested necessary and sufficient criterion for deciding whether two
partitions are rook-equivalent.


12.10. Theorem: Rook-Equivalence of Ferrers Boards. Suppose $\mu$ and $\nu$ are partitions
of $n$. Write $\mu = (\mu_1 \ge \cdots \ge \mu_n)$ and $\nu = (\nu_1 \ge \cdots \ge \nu_n)$ by adding zero parts if
necessary. The rook polynomials $R_\mu(x)$ and $R_\nu(x)$ are equal iff the multisets

$$[\mu_1 + 1, \mu_2 + 2, \ldots, \mu_n + n] \quad \text{and} \quad [\nu_1 + 1, \nu_2 + 2, \ldots, \nu_n + n]$$

are equal.

Proof. The idea of the proof is to use modified versions of the rook polynomials that involve
linear combinations of the falling factorial polynomials $(x){\downarrow}_k$ instead of the monomials $x^k$
(see Definition 2.63). For any partition $\lambda$, define the falling rook polynomial

$$R^*_\lambda(x) = \sum_{k=0}^{n} r_{n-k}(\lambda)\,(x){\downarrow}_k = \sum_{k=0}^{n} r_{n-k}(\lambda)\, x(x-1)(x-2)\cdots(x-k+1).$$

Note that $R_\mu(x) = R_\nu(x)$ iff $r_k(\mu) = r_k(\nu)$ for all $k$ between 0 and $n$ (by linear independence
of the monomial basis) iff $R^*_\mu(x) = R^*_\nu(x)$ (by linear independence of the falling factorial
basis). We now prove that $R^*_\mu(x) = R^*_\nu(x)$ iff the multisets mentioned in the theorem are
equal.

First we use rook combinatorics to prove a factorization formula for $R^*_\mu(x)$. Fix a positive
integer $x$. Consider the extended board $F_\mu(x)$, which has $\mu_i + x$ squares in row $i$, for
$1 \le i \le n$. We obtain $F_\mu(x)$ from the board $F_\mu$ by adding $x$ new squares on the left end
of each of the $n$ rows. Let us count the number of placements of $n$ non-attacking rooks on
$F_\mu(x)$. On one hand, we can build such a placement by working up the rows from bottom to
top, placing a rook in a valid column of each successive row. For $j \ge 0$, when we place the
rook in the $(j+1)$th row from the bottom, there are $x + \mu_{n-j}$ columns in this row, but we
must avoid the $j$ distinct columns that already have rooks in lower rows. By the Product
Rule, the number of valid placements is

$$(x + \mu_n)(x + \mu_{n-1} - 1) \cdots (x + \mu_1 - (n-1)) = \prod_{i=1}^{n} (x + [\mu_i - (n-i)]).$$


On the other hand, let us count the number of placements of $n$ non-attacking rooks on
$F_\mu(x)$ that have exactly $k$ rooks on the original board $F_\mu$. We can place these rooks first in
$r_k(\mu)$ ways. The remaining $n - k$ rooks must go in the remaining $n - k$ unused rows in one
of the leftmost $x$ squares. Placing these rooks one at a time, we obtain
$r_k(\mu)\, x(x-1)(x-2)\cdots(x - (n-k-1))$ valid placements. Adding over $k$ gives the identity

$$\sum_{k=0}^{n} r_k(\mu)\,(x){\downarrow}_{n-k} = \prod_{i=1}^{n} (x + [\mu_i - (n-i)]). \tag{12.1}$$

Replacing $k$ by $n - k$ in the summation, we find that

$$R^*_\mu(x) = \prod_{i=1}^{n} (x + [\mu_i - (n-i)]).$$

This polynomial identity holds for infinitely many values of $x$ (namely, for each positive
integer $x$), so the identity must hold in the polynomial ring $\mathbb{R}[x]$. Similarly,

$$R^*_\nu(x) = \prod_{i=1}^{n} (x + [\nu_i - (n-i)]).$$

The proof is now completed by invoking the uniqueness of prime factorizations for
one-variable polynomials with real coefficients. More precisely, note that we have exhibited
factorizations of $R^*_\mu(x)$ and $R^*_\nu(x)$ into products of linear factors. These two monic
polynomials are equal iff their linear factors (counting multiplicities) are the same, which holds iff
the multisets

$$[\mu_i - (n-i) : 1 \le i \le n] \quad \text{and} \quad [\nu_i - (n-i) : 1 \le i \le n]$$

are the same. Adding $n$ to everything, this is equivalent to the multiset equality in the
theorem statement. □


12.11. Example. The partitions (2,2,0,0) and (3,1,0,0) are rook-equivalent, because
[3,4,3,4] = [4,3,3,4]. The partitions (4,2,1) and (5,2) are not rook-equivalent, since
[5,4,4,4,5,6,7] ≠ [6,4,3,4,5,6,7].


12.4 Parking Functions


This section defines combinatorial objects called parking functions. We then count parking 
functions using a probabilistic argument. 


12.12. Definition: Parking Functions. For $n \in \mathbb{Z}_{>0}$, a parking function of order $n$ is a
function $f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ such that for each $i$ in the range $1 \le i \le n$, there
are at least $i$ inputs $x$ satisfying $f(x) \le i$.


12.13. Example. For $n = 8$, the function $f$ defined by


f(1) = 2, f(2) = 6, f(3) = 3, f(4) = 2, f(5) = 6, f(6) = 2, f(7) = 2, f(8) = 1


is a parking function. The function g defined by 


is not a parking function because $g(x) \le 4$ is true for only three values of $x$, namely
$x = 3, 6, 8$.


Here is the reason these functions are called “parking functions.” Consider a one-way 
street with n parking spaces numbered 1,2,...,n. Cars numbered 1, 2,...,n arrive at the 
beginning of this street in numerical order. Each car wants to park in its own preferred 
spot on the street. We encode these parking preferences by a function $h : \{1, 2, \ldots, n\} \to
\{1, 2, \ldots, n\}$, by letting $h(x)$ be the parking spot preferred by car $x$. Given $h$, the cars park
in the following way. For $x = 1, 2, \ldots, n$, car $x$ arrives and drives forward along the street
to the spot $h(x)$. If that spot is empty, car $x$ parks there. Otherwise, the car continues to
drive forward on the one-way street and parks in the first available spot after $h(x)$, if any.
The cars cannot return to the beginning of the street, so it is possible that not every car 
will be able to park. 

For example, suppose the parking preferences are given by the parking function f defined 
in Example 12.13. Car 1 arrives first and parks in spot 2. Cars 2 and 3 arrive and park in 
spots 6 and 3, respectively. When car 4 arrives, spots 2 and 3 are full, so car 4 parks in spot 
4. This process continues. At the end, every car has parked successfully, and the parking
order is 8, 1, 3, 4, 6, 2, 5, 7. Now suppose the parking preferences are given by the non-parking
function g from Example 12.13. After the first six cars have arrived, the parking spots on
the street are filled as follows:

3, 6, -, -, 1, 2, 4, 5.


Car 7 arrives and drives to spot g(7) = 7. Since spots 7 and 8 are both full at this point, 
car 7 cannot park. 
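
The parking process is simple to simulate. The Python sketch below is an illustration under our own conventions: preferences are given as a list whose entry at index x - 1 is the spot preferred by car x, and the names `park` and `is_parking_function` are hypothetical. It reproduces the parking order obtained from the function f of Example 12.13.

```python
def park(prefs):
    """Simulate one-way-street parking.

    Returns (spots, stranded): spots[j-1] is the car in spot j (or None),
    and stranded lists the cars that drove off the end of the street.
    """
    n = len(prefs)
    spots = [None] * (n + 1)  # index 0 is unused so that spots are numbered 1..n
    stranded = []
    for car, want in enumerate(prefs, start=1):
        spot = want
        while spot <= n and spots[spot] is not None:
            spot += 1         # keep driving forward past occupied spots
        if spot <= n:
            spots[spot] = car
        else:
            stranded.append(car)
    return spots[1:], stranded


def is_parking_function(prefs):
    """Check Definition 12.12: at least i inputs x satisfy f(x) <= i, for each i."""
    n = len(prefs)
    return all(sum(1 for p in prefs if p <= i) >= i for i in range(1, n + 1))


f = [2, 6, 3, 2, 6, 2, 2, 1]
print(park(f))                 # ([8, 1, 3, 4, 6, 2, 5, 7], [])
print(is_parking_function(f))  # True
print(park([2, 2]))            # ([None, 1], [2])
```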


12.14. Lemma. A function $h : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ is a parking function iff every
car is able to park using the parking preferences determined by $h$.


Proof. We prove the contrapositive in each direction. Suppose first that $h$ is not a parking
function. Then there exists $i \in \{1, 2, \ldots, n\}$ such that $h(x) \le i$ holds for fewer than $i$ choices
of $x$. This means that fewer than $i$ cars prefer to park in the first $i$ spots. But then the first
$i$ spots cannot all be used, since a car never parks in a spot prior to the spot it prefers.
Since there are $n$ cars and $n$ spots, the existence of an unused spot implies that not every
car can park.

Conversely, assume not every car can park. Let $i$ be the earliest spot that is not taken
after every car has attempted to park. Then no car preferred spot $i$. Suppose $i$ or more cars
preferred the first $i - 1$ spots. Not all of these cars can park in the first $i - 1$ spots. But
then one of these cars would have parked in spot $i$, a contradiction. We conclude that fewer
than $i$ cars preferred one of the first $i$ spots, so that $h(x) \le i$ is true for fewer than $i$ choices
of $x$. This means that $h$ is not a parking function. □


12.15. The Parking Function Rule. For all $n \in \mathbb{Z}_{>0}$, there are $(n+1)^{n-1}$ parking
functions of order $n$.


Proof. Fix $n \in \mathbb{Z}_{>0}$. Define a circular parking function of order $n$ to be any function $f :
\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n+1\}$. Let $Z$ be the set of all such functions; by the Function Rule,
$|Z| = (n+1)^n$. We interpret circular parking functions as follows. Imagine a roundabout
(circular street) with $n + 1$ parking spots numbered $1, 2, \ldots, n+1$. See Figure 12.5. As
before, $f$ encodes the parking preferences of $n$ cars that wish to park on the roundabout.
Thus, for $1 \le x \le n$ and $1 \le y \le n+1$, $y = f(x)$ iff car $x$ prefers to park in spot $y$. Cars
$1, 2, \ldots, n$ arrive at the roundabout in increasing order. Each car $x$ enters just before spot
1, then drives around to spot $f(x)$ and parks there if possible. If spot $f(x)$ is full, car $x$
keeps driving around the roundabout and parks in the first empty spot that it encounters.

FIGURE 12.5
Parking on a roundabout.


No matter what $f$ is, every car succeeds in parking in the circular situation. Moreover,
since there are now $n + 1$ spots and only $n$ cars, there is always one empty spot at the end.
Suppose we randomly select a circular parking function. Because of the symmetry of the
roundabout, each of the $n + 1$ parking spaces is equally likely to be the empty one. The
fact that the entrance to the roundabout is at spot 1 is irrelevant here, since for parking
purposes we may as well assume that car $x$ enters the roundabout at its preferred spot $f(x)$.
Thus, the probability that spot $k$ is empty is $\frac{1}{n+1}$, for $1 \le k \le n+1$.

On the other hand, spot $n + 1$ is the empty spot iff $f$ is a parking function of order $n$.
For, if spot $n + 1$ is empty, then no car preferred spot $n + 1$, and no car passed spot $n + 1$
during the parking process. In this case, the circular parking process on the roundabout
coincides with the original parking process on the one-way street. The converse is established
similarly. Since spot $n + 1$ is empty with probability $1/(n+1)$ and the sample space $Z$
has size $(n+1)^n$, we conclude that the number of ordinary parking functions must be
$|Z|/(n+1) = (n+1)^{n-1}$. □
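
Both the symmetry claim and the final count can be checked by exhaustive enumeration for small n. The Python sketch below uses hypothetical helper names (`empty_spot`, `empty_spot_tally`, `count_parking_functions`) and represents a circular parking function as a tuple of preferences in {1, ..., n+1}. It tallies which spot is left empty over all circular parking functions and, separately, counts ordinary parking functions straight from Definition 12.12.

```python
from collections import Counter
from itertools import product


def empty_spot(prefs, n):
    """Park n cars on a roundabout with spots 1..n+1; return the spot left empty."""
    taken = [False] * (n + 2)          # index 0 is unused
    for want in prefs:
        spot = want
        while taken[spot]:
            spot = spot % (n + 1) + 1  # advance one spot, wrapping from n+1 back to 1
        taken[spot] = True
    return next(s for s in range(1, n + 2) if not taken[s])


def empty_spot_tally(n):
    """Count, for each spot k, the circular parking functions that leave k empty."""
    return Counter(empty_spot(p, n) for p in product(range(1, n + 2), repeat=n))


def count_parking_functions(n):
    """Count functions f with at least i inputs x satisfying f(x) <= i, for each i."""
    return sum(
        all(sum(p <= i for p in prefs) >= i for i in range(1, n + 1))
        for prefs in product(range(1, n + 1), repeat=n)
    )


for n in range(1, 5):
    tally = empty_spot_tally(n)
    assert set(tally.values()) == {(n + 1) ** (n - 1)}
    assert count_parking_functions(n) == tally[n + 1] == (n + 1) ** (n - 1)

print(sorted(empty_spot_tally(3).items()))  # [(1, 16), (2, 16), (3, 16), (4, 16)]
print(count_parking_functions(3))           # 16
```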


12.16. Remark. Let A_{n,k} be the set of circular parking functions of order n with empty
spot k. The preceding proof shows that |A_{n,k}| = (n+1)^{n-1} for k between 1 and n+1.
We established this counting result by a probabilistic argument, using symmetry to deduce
that P(A_{n,k}) = 1/(n+1) for all k. Readers bothered by the vague appeal to symmetry may
prefer the following more rigorous argument. Suppose f ∈ A_{n,k_1} and k_2 are given. Let φ(f)
be the function sending i to (f(i) + k_2 - k_1) mod (n+1) for 1 ≤ i ≤ n, taking the remainder
to lie in the range {1,2,...,n+1}. Informally, φ(f) rotates all of the parking preferences in
f by k_2 - k_1. One may check that φ is a bijection from A_{n,k_1} onto A_{n,k_2}. These bijections
prove that all the sets A_{n,k} (for k between 1 and n+1) have the same cardinality.
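
The equal-cardinality claim can be checked by brute force for small n. The following Python sketch is an illustration only, not part of the original argument; the names circular_park and empty_spot_counts are our own. It simulates the roundabout for every function f : {1,...,n} → {1,...,n+1} and tallies which spot ends up empty.

```python
from collections import Counter
from itertools import product

def circular_park(f, n):
    """Park cars 1..n on a roundabout with spots 1..n+1; return the empty spot.

    f is a tuple in which f[x-1] is the preferred spot of car x.
    """
    taken = set()
    for pref in f:
        spot = pref
        while spot in taken:              # keep driving around the roundabout
            spot = spot % (n + 1) + 1     # spot n+1 is followed by spot 1
        taken.add(spot)
    (empty,) = set(range(1, n + 2)) - taken
    return empty

def empty_spot_counts(n):
    """Tally all (n+1)^n circular parking functions of order n by empty spot."""
    spots = range(1, n + 2)
    return Counter(circular_park(f, n) for f in product(spots, repeat=n))

counts = empty_spot_counts(3)
for k in sorted(counts):
    print(k, counts[k])
```

For n = 3, each of the four spots should be reported with the same count, namely (n+1)^{n-1} = 16.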


12.17. Remark. One of the original motivations for studying parking functions was their 
connection to hashing protocols. Computer programs often store information in a data 
structure called a hash table. We consider a simplified model where n items are to be stored 
in a linear array of n cells. A hash function h : {1,2,...,n} → {1,2,...,n} is used to
determine where each item is stored. We store item i in position h(i), unless that position
has already been taken by a previous item; this circumstance is called a collision. We
handle collisions via the following collision resolution policy: if h(i) is full, we store item i in
the earliest position after position h(i) that is not yet full (if any). If there is no such position,
the collision resolution fails (we do not allow wraparound). This scenario is exactly like that 
of the cars parking on a one-way street according to the preferences encoded by h. Thus, 
we can store all n items in the hash table iff h is a parking function. 
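
The collision resolution policy just described can be sketched in a few lines of Python. This is a simplified illustration under our own naming (store_items and is_parking_function are hypothetical helpers, not from the text); the parking test uses the counting condition |{x : h(x) ≤ i}| ≥ i for all i.

```python
from itertools import product

def store_items(h, n):
    """Linear probing without wraparound; h[i-1] is the hash value of item i.

    Returns the filled table as a list of item numbers, or None on failure.
    """
    table = [None] * (n + 1)              # cells 1..n; index 0 is unused
    for item, pos in enumerate(h, start=1):
        while pos <= n and table[pos] is not None:
            pos += 1                      # collision: try the next cell
        if pos > n:
            return None                   # ran off the end of the array
        table[pos] = item
    return table[1:]

def is_parking_function(h):
    """h is a parking function iff at least i values are <= i, for every i."""
    n = len(h)
    return all(sum(v <= i for v in h) >= i for i in range(1, n + 1))

# Storage succeeds exactly for the parking functions (checked here for n = 4).
for h in product(range(1, 5), repeat=4):
    assert (store_items(h, 4) is not None) == is_parking_function(h)
```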




12.5 Parking Functions and Trees 


This section uses parking functions (defined in §12.4) to give a bijective proof of the Tree 
Rule 3.71 for the number of n-vertex trees. The proof involves labeled lattice paths, which 
we now define. 


12.18. Definition: Labeled Lattice Paths. A labeled lattice path consists of a lattice 
path P from (0,0) to (a,b), together with a labeling of the b north steps of P with labels 
1,2,...,b (each used exactly once) such that the labels for the north steps in a given column
increase from bottom to top. 


We illustrate a labeled lattice path by drawing the path inside an (a + 1) x b grid of 
unit squares and placing the label of each north step in the unit square to the right of that 
north step. For example, Figure 12.6 displays a labeled lattice path ending at (5,7). 


12.19. Theorem: Labeled Paths. There are (a+1)^b labeled lattice paths from (0,0) to
(a,b).


Proof. It suffices to construct a bijection between the set of labeled lattice paths ending at
(a,b) and the set of all functions f : {1,2,...,b} → {1,2,...,a+1}. Given a labeled lattice
path P, define the associated function by setting f(i) = j for all labels i in column j of P.

The inverse map acts as follows. Given a function f : {1,2,...,b} → {1,2,...,a+1}, let
S_j = {x : f(x) = j} and s_j = |S_j| for j between 1 and a+1. The labeled path associated
to f is the lattice path N^{s_1} E N^{s_2} E ··· E N^{s_{a+1}}, where the jth string of consecutive
north steps is labeled by the elements of S_j in increasing order. □


FIGURE 12.6
A labeled lattice path.
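
The two maps in this proof are easy to sketch in Python. The representation below is our own choice, not the book's: a function is a dict from labels to columns, and a path is a list whose entries are either the string "E" or a pair ("N", label). The function g is the one that appears in Example 12.20.

```python
def function_to_path(f, a):
    """Build the labeled path N^{s_1} E N^{s_2} E ... E N^{s_{a+1}} from f."""
    steps = []
    for col in range(1, a + 2):
        # north steps in this column, labeled by S_col in increasing order
        for label in sorted(x for x in f if f[x] == col):
            steps.append(("N", label))
        if col <= a:
            steps.append("E")
    return steps

def path_to_function(steps):
    """Recover f by recording the column in which each label occurs."""
    f, col = {}, 1
    for step in steps:
        if step == "E":
            col += 1
        else:
            f[step[1]] = col
    return f

g = {1: 2, 2: 5, 3: 4, 4: 2, 5: 4, 6: 2, 7: 2}
path = function_to_path(g, a=5)
assert path_to_function(path) == g       # the two maps undo each other here
```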


12.20. Example. The function f associated to the labeled path P in Figure 12.6 is given
by

    f(1) = 2, f(2) = 4, f(3) = 1, f(4) = 6, f(5) = 4, f(6) = 4, f(7) = 1.

Going the other way, the function g : {1,2,...,7} → {1,2,...,6} defined by

    g(1) = 2, g(2) = 5, g(3) = 4, g(4) = 2, g(5) = 4, g(6) = 2, g(7) = 2

is mapped to the labeled lattice path shown in Figure 12.7.


FIGURE 12.7
Converting a function to a labeled path.


A labeled Dyck path of order n is a Dyck path ending at (n,n) that is labeled according 
to the rules in Definition 12.18. For example, Figure 12.8 displays the 16 labeled Dyck paths 
of order 3. 


12.21. Theorem: Labeled Dyck Paths. For all n ∈ Z_{≥1}, there are (n+1)^{n-1} labeled
Dyck paths of order n.




FIGURE 12.8 
Labeled Dyck paths. 


Proof. Using the bijection from the proof of Theorem 12.19, we can regard labeled lattice
paths from (0,0) to (n,n) as functions f : {1,2,...,n} → {1,2,...,n+1}. We first show
that labeled paths that are not Dyck paths correspond to non-parking functions under this
bijection. A labeled path P is not a Dyck path iff some east step of P goes from (i-1,j) to
(i,j) for some i > j. This condition holds for P iff the function f associated to P satisfies
|{x : f(x) ≤ i}| = j for some i > j. In turn, this condition on f is equivalent to the existence
of i such that |{x : f(x) ≤ i}| < i. But this means that f is not a parking function (see
Definition 12.12). We now see that labeled Dyck paths are in bijective correspondence with
parking functions. So the result follows from Theorem 12.15. □
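
As a sanity check on this proof, the sketch below (our own illustration; the function names are hypothetical) walks the path encoded by f and tests geometrically whether each east step ends weakly above the diagonal, then compares with the counting condition for parking functions.

```python
from itertools import product

def is_dyck(f, n):
    """f[x-1] in 1..n+1 is the column of label x in a path from (0,0) to (n,n)."""
    height = 0                            # north steps taken so far
    for col in range(1, n + 1):
        height += sum(1 for v in f if v == col)
        if col > height:                  # east step ends at (col, height), below y = x
            return False
    return True

def is_parking_function(f):
    """Counting condition: at least i values of f are <= i, for every i."""
    n = len(f)
    return all(sum(v <= i for v in f) >= i for i in range(1, n + 1))

def count_labeled_dyck_paths(n):
    return sum(is_dyck(f, n) for f in product(range(1, n + 2), repeat=n))

print(count_labeled_dyck_paths(3))        # the 16 paths displayed in Figure 12.8
```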


12.22. Parking Function Proof of the Tree Rule. There are (n+1)^{n-1} trees with
vertex set {0,1,2,...,n}.


Proof. Because of the previous result, it suffices to define bijections between the set B of
labeled Dyck paths of order n and the set C of all trees with vertex set {0,1,2,...,n}. To
define f : B → C, let P be a labeled Dyck path of order n. Let (a_1, a_2,...,a_n) be the
sequence of labels in the diagram of P, reading from the bottom row to the top row, and
set a_0 = 0. Define a graph T = f(P) as follows. For 0 ≤ j ≤ n, there is an edge in T from
vertex a_j to each vertex whose label appears in column j+1 of the diagram of P. These are
all the edges of T. Using the fact that P is a labeled Dyck path, one proves by induction
on j that every a_j is either 0 or appears to the left of column j+1 in the diagram, so that
every vertex in column j+1 of P is reachable from vertex 0 in T. Thus, T = f(P) is a
connected graph with n edges and n+1 vertices, so T is a tree by Theorem 3.70.


12.23. Example. Figure 12.9 shows a parking function f, the labeled Dyck path P
corresponding to f, and the tree T = f(P). We can use the figure to compute the edges of T by
writing a_j underneath column j+1, for 0 ≤ j ≤ n. If we regard zero as the ancestor of all
other vertices, then the labels in column j+1 are the children of vertex a_j.


    x      1  2  3  4  5  6  7  8
    f(x)   4  8  5  1  4  1  1  5

FIGURE 12.9
Mapping parking functions to labeled Dyck paths to trees.


Continuing the proof, we define the inverse map f' : C → B. Let T ∈ C be a tree with
vertex set {0,1,2,...,n}. We generate the diagram for f'(T) by inserting labels into an
n × n grid from bottom to top. Denote these labels by (a_1,...,a_n), and set a_0 = 0. The
labels a_1, a_2, ... in column 1 are the vertices of T adjacent to vertex a_0 = 0, written in
increasing order from bottom to top. The labels in the second column are the neighbors of
a_1 other than vertex 0. The labels in the third column are the neighbors of a_2 not in the
set {a_0, a_1}. In general, the labels in column j+1 are the neighbors of a_j not in the set
{a_0, a_1,...,a_{j-1}}, written in increasing order from bottom to top. Observe that we do not
know the full sequence (a_1,...,a_n) in advance, but we reconstruct this sequence as we go.


FIGURE 12.10
Mapping trees to labeled Dyck paths to parking functions.


12.24. Example. Figure 12.10 shows a tree T, the labeled Dyck path P = f'(T), and the
parking function associated to P.


Let us check that f' is well-defined. We break up the computation of f'(T) into stages,
where stage j consists of choosing the increasing sequence of labels a_i < ··· < a_k that occur
in column j. We claim that at each stage j with 1 ≤ j ≤ n+1, a_{j-1} has already been
computed, so that the labels entered in column j occur in rows j or higher. This will show
that the algorithm for computing f' is well-defined and produces a labeled Dyck path. We
proceed by induction on j. The claim holds for j = 1, since a_0 = 0 by definition. Assume
that 1 < j ≤ n+1 and that the claim holds for all j' < j. To get a contradiction, assume
that a_{j-1} is not known when we reach column j. Since the claim holds for j - 1, we must
have already recovered the labels in the set W = {a_0 = 0, a_1,...,a_{j-2}}, which are precisely
the labels that occur in the first j - 1 columns. Let z be a vertex of T not in W. Since T
is a tree, there is a path from 0 to z in T. Let y be the earliest vertex on this path not in
W, and let x be the vertex just before y on the path. By choice of y, we have x ∈ W, so
that x = 0 or x = a_k for some k ≤ j - 2. But if x = 0, then y occurs in column 1 and
hence y ∈ W. And if x = a_k, then the algorithm for f' would have placed y in column
k+1 ≤ j-1, and again y ∈ W. These contradictions show that the claim holds for j. It is
now routine to check that f' is the two-sided inverse of f. □
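
The two maps in this proof can be tried out on small cases. The Python sketch below is our own illustration with hypothetical names: a labeled Dyck path is encoded by its parking function f (a tuple with f[x-1] the column of label x), and a tree is a set of two-element frozensets.

```python
from itertools import product

def path_to_tree(f):
    """Join a_j to every label in column j+1, where a_0 = 0 and a_1..a_n read bottom to top."""
    n = len(f)
    columns = {c: [x for x in range(1, n + 1) if f[x - 1] == c] for c in range(1, n + 2)}
    a = [0] + [x for c in range(1, n + 2) for x in columns[c]]
    return frozenset(frozenset((a[j], x)) for j in range(n + 1) for x in columns[j + 1])

def tree_to_path(edges, n):
    """Inverse map: column j+1 receives the neighbors of a_j not among a_0..a_{j-1}."""
    nbrs = {v: set() for v in range(n + 1)}
    for e in edges:
        u, v = tuple(e)
        nbrs[u].add(v)
        nbrs[v].add(u)
    a, f = [0], [None] * n
    for j in range(n + 1):
        for x in sorted(nbrs[a[j]] - set(a[:j])):
            f[x - 1] = j + 1
            a.append(x)
    return tuple(f)

def parking_functions(n):
    """All f : {1..n} -> {1..n+1} with at least i values <= i for every i."""
    for f in product(range(1, n + 2), repeat=n):
        if all(sum(v <= i for v in f) >= i for i in range(1, n + 1)):
            yield f

fig = (4, 8, 5, 1, 4, 1, 1, 5)            # the parking function tabulated in Figure 12.9
print(sorted(tuple(sorted(e)) for e in path_to_tree(fig)))
assert tree_to_path(path_to_tree(fig), 8) == fig
```

Running both maps over all parking functions of a small order n gives an experimental check that distinct paths yield distinct trees.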


12.6 Möbius Inversion and Field Theory


This section gives two applications of the material in §4.9 to field theory. We show that
every finite subgroup of the multiplicative group of any field must be cyclic; and we count
the number of irreducible polynomials of a given degree with coefficients in a given finite
field. The starting point for proving the first result is the relation n = \sum_{d|n} φ(d), proved in
Theorem 4.37. We begin by giving a combinatorial interpretation of this identity in terms
of the orders of elements in a cyclic group of size n.


12.25. Theorem: Order of Elements in a Cyclic Group. Suppose G is a cyclic group
of size d < ∞, written multiplicatively. If x ∈ G generates G and c ∈ Z_{≥1}, then x^c generates
a cyclic subgroup of G of size d/gcd(c,d) = lcm(c,d)/c.


Proof. Since the cyclic subgroup generated by x^c is a subset of the finite group G, the order
of x^c must be finite. Let k be the order of x^c. We have seen in Example 7.60 that k is the
smallest positive integer such that (x^c)^k = 1_G, and that the k elements x^c, (x^c)^2 = x^{2c}, ...,
(x^c)^k = x^{kc} are distinct and constitute the cyclic subgroup of G generated by x^c. Since x
has order d, we know from Example 7.60 that x^m = 1_G iff d|m. It follows from this and
the definition of k that kc is the least positive multiple of c that is also a multiple of d.
In other words, kc = lcm(c,d). It follows that the order of x^c is k = lcm(c,d)/c. Since
cd = lcm(c,d) gcd(c,d), we also have k = d/gcd(c,d). □
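
A quick numerical illustration of this formula, in Python: we take G to be the multiplicative group of nonzero residues modulo 13 with generator x = 2. This specific group and the helper name order_mod are our own choices for the example.

```python
from math import gcd

def order_mod(y, p):
    """Multiplicative order of y modulo p (assumes gcd(y, p) = 1)."""
    k, power = 1, y % p
    while power != 1:
        power = power * y % p
        k += 1
    return k

p, x = 13, 2
d = order_mod(x, p)                       # size of the cyclic group generated by x
for c in range(1, 2 * d + 1):
    # the order of x^c should be d / gcd(c, d)
    assert order_mod(pow(x, c, p), p) == d // gcd(c, d)
```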


12.26. Theorem: Number of Generators of a Cyclic Group. If G is a cyclic group
of size d < ∞, then there are exactly φ(d) elements in G that generate G.


Proof. Let x be a fixed generator of G. By Example 7.60, the d distinct elements of G are
x^1, x^2, ..., x^d = 1_G. By Theorem 12.25, the element x^c generates all of G iff gcd(c,d) = 1.
By Definition 4.12, the number of such integers c between 1 and d is precisely φ(d). □


12.27. Theorem: Subgroup Structure of Cyclic Groups. Let G be a cyclic group 
of size n < ∞. For each d dividing n, there exists exactly one subgroup of G of size d, and
this subgroup is cyclic. 




Proof. We only sketch the proof, which uses some results about group homomorphisms that 
were stated as exercises in Chapter 7. We know from Theorem 7.40 that every subgroup 
of the cyclic group Z has the form kZ for some unique k > 0, and is therefore cyclic. 
Next, any finite cyclic group G can be viewed as the quotient group Z/nZ for some n > 1. 
This follows by applying the Fundamental Homomorphism Theorem (Exercise 7-57) to the 
homomorphism from Z to G sending 1 to a generator of G. By the Correspondence Theorem 
(Exercise 7-61), each subgroup H of G has the form H = mZ/nZ for some subgroup mZ of 
Z containing nZ. Now, mZ contains nZ iff m|n, and in this case |mZ/nZ| = n/m. It follows 
that there is a bijection between the positive divisors of n and the subgroups of G. Each 
such subgroup is the homomorphic image of a cyclic group mZ, so each subgroup of G is
cyclic. □


Suppose G is cyclic of size n. For each d dividing n, let G_d be the unique (cyclic)
subgroup of G of size d. On one hand, each of the n elements y of G generates exactly
one of the subgroups G_d (namely, y generates the group G_d such that d is the order of y).
On the other hand, we have shown that G_d has exactly φ(d) generators. Invoking the Sum
Rule, we obtain a new group-theoretic proof of the fact that n = \sum_{d|n} φ(d).
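
This classification of elements by order can be observed directly in the additive group Z/nZ, where the element c has order n/gcd(c,n) by Theorem 12.25. The following Python sketch is illustrative only; phi and order_counts are our own helper names.

```python
from collections import Counter
from math import gcd

def phi(d):
    """Euler's function: the number of c in 1..d with gcd(c, d) = 1."""
    return sum(1 for c in range(1, d + 1) if gcd(c, d) == 1)

def order_counts(n):
    """Number of elements of each order in the additive cyclic group Z/nZ."""
    return Counter(n // gcd(c, n) for c in range(1, n + 1))

n = 12
for d, count in sorted(order_counts(n).items()):
    print(d, count, phi(d))               # the last two columns should agree
```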


12.28. Theorem: Detecting Cyclic Groups. If G is a group of size n < ∞ such that
for each d dividing n, G has at most one subgroup of size d, then G is cyclic.


Proof. For each d dividing n, let T_d be the set of elements in G of order d. G is the disjoint
union of the sets T_d by Theorem 7.100. Consider a fixed choice of d such that T_d is nonempty.
Then G has an element of order d, hence has a cyclic subgroup of size d. By assumption,
this is the only subgroup of G of size d, and we know this subgroup has φ(d) generators.
Therefore, |T_d| = φ(d) whenever |T_d| ≠ 0. We conclude that

    n = |G| = \sum_{d|n} |T_d| ≤ \sum_{d|n} φ(d) = n.

Since the extreme ends of this calculation both equal n, the middle inequality here must
in fact be an equality. This is only possible if every T_d is nonempty. In particular, T_n is
nonempty. Therefore, G is cyclic, since it is generated by each of the elements in T_n. □


12.29. Theorem: Multiplicative Subgroups of Fields. Let F be any field, possibly
infinite. If G is a finite subgroup of the multiplicative group of F, then G is cyclic.


Proof. Suppose G is an n-element subgroup of the multiplicative group of nonzero elements
of F, where n < ∞. By Theorem 12.28, it suffices to show that G has at most one subgroup
of size d, for each d dividing n. If not, let H and K be two distinct subgroups of G of size
d. Then H ∪ K is a set with at least d+1 elements; and for each z ∈ H ∪ K, it follows from
Theorem 7.100 that z is a root of the polynomial x^d - 1 in F. But any polynomial of degree
d over F has at most d distinct roots in the field F, by Exercise 12-72. This contradiction
completes the proof. □
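
For the field of integers modulo a prime p, the theorem says that the p - 1 nonzero residues form a cyclic group. The short Python search below is a sketch of our own (find_generator is a hypothetical name); it looks for a generator by brute force.

```python
def find_generator(p):
    """Return some g in 1..p-1 whose powers give every nonzero residue mod p."""
    for g in range(1, p):
        seen, power = set(), 1
        for _ in range(p - 1):
            power = power * g % p
            seen.add(power)
        if len(seen) == p - 1:            # the powers of g exhaust the group
            return g
    return None                           # should not happen when p is prime

for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
    print(p, find_generator(p))
```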


Our next goal is to count irreducible polynomials of a given degree over a finite field. We
shall assume a number of results from field theory, whose proofs may be found in Chapter
V of the algebra text by Hungerford [65]. Let F be a finite field with q elements. It is
known that q must be a prime power, say q = p^e, and F is uniquely determined (up to
isomorphism) by its cardinality q. Every finite field F with q = p^e elements is a splitting
field for the polynomial x^q - x over Z/pZ.


12.30. Theorem: Counting Irreducible Polynomials. Let F be a field with q = p^e
elements. For each n ∈ Z_{≥1}, let I(n,q) be the number of monic irreducible polynomials of
degree n in the polynomial ring F[x]. Then

    q^n = \sum_{d|n} d I(d,q),    and hence    I(n,q) = \frac{1}{n} \sum_{d|n} q^d μ(n/d).


Proof. The strategy of the proof is to classify the elements in a finite field K of size q^n based
on their minimal polynomials. From field theory, we know that each element u ∈ K is the
root of a uniquely determined monic, irreducible polynomial in F[x] (called the minimal
polynomial of u over F). The degree d of this minimal polynomial is d = [F(u) : F], where
for any field extension E ⊆ H, [H : E] denotes the dimension of H viewed as a vector space
over E. It is known that n = [K : F] = [K : F(u)]·[F(u) : F], so that d|n. Conversely,
given any divisor d of n, we claim that every irreducible polynomial of degree d in F[x] has
d distinct roots in K. Sketch of proof: Suppose g is such a polynomial and z ≠ 0 is a root
of g in a splitting field of g over K. Since z lies in F(z), which is a field with q^d elements,
it follows from Theorem 7.100 (applied to the multiplicative group of the field F(z)) that
z^{q^d - 1} = 1. It can be checked that q^d - 1 divides q^n - 1 (since d|n), so that z^{q^n - 1} = 1,
and hence z is a root of x^{q^n} - x. It follows that every root z of g must lie in K, which is
a splitting field for x^{q^n} - x. Furthermore, since z is a root of x^{q^n} - x, it follows that the
minimal polynomial for z over F (namely g) divides x^{q^n} - x in F[x]. We conclude that g
divides x^{q^n} - x in K[x] also. The polynomial x^{q^n} - x is known to split into a product of q^n
distinct linear factors over K; in fact, x^{q^n} - x = \prod_{x_0 ∈ K} (x - x_0). By unique factorization
in the polynomial ring K[x], g must also be a product of d distinct linear factors. This
completes the proof of the claim.

We can now write K as the disjoint union of sets R_g indexed by all monic irreducible
polynomials g in F[x] whose degrees divide n, where R_g consists of the deg(g) distinct roots
of g in K. Invoking the Sum Rule and then grouping together terms indexed by polynomials
of the same degree d, we obtain

    q^n = |K| = \sum_{monic irred. g ∈ F[x]: deg(g)|n} |R_g|
              = \sum_{monic irred. g ∈ F[x]: deg(g)|n} deg(g)
              = \sum_{d|n} d I(d,q).

We can now apply the Möbius Inversion Formula 4.33 to the functions f(n) = q^n and
g(n) = n I(n,q) to obtain

    n I(n,q) = \sum_{d|n} q^d μ(n/d). □
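
The formula for I(n,q) is easy to evaluate, and for q = 2 it can be compared with a direct search. In the Python sketch below (our own illustration; all function names are hypothetical), a polynomial over the two-element field is stored as an integer bitmask, and the brute-force count sieves out every product of two polynomials of positive degree.

```python
def mobius(m):
    """Moebius function computed by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0                  # m has a square factor
            result = -result
        p += 1
    return -result if m > 1 else result

def num_irreducible(n, q):
    """I(n, q) = (1/n) * (sum over d | n of q^d * mu(n/d))."""
    total = sum(q**d * mobius(n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def count_irreducible_gf2(n):
    """Brute force over the two-element field; bit i of a mask is the coefficient of x^i."""
    def mul(a, b):                        # carry-less product = polynomial product mod 2
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            b >>= 1
        return r
    reducible = set()
    for da in range(1, n // 2 + 1):       # degree of the smaller factor
        for a in range(1 << da, 1 << (da + 1)):
            for b in range(1 << (n - da), 1 << (n - da + 1)):
                reducible.add(mul(a, b))
    return (1 << n) - len(reducible)      # there are 2^n polynomials of degree n

for n in range(1, 9):
    print(n, num_irreducible(n, 2), count_irreducible_gf2(n))
```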


12.7 q-Binomial Coefficients and Subspaces


Recall from Definition 8.30 that the q-binomial coefficients are polynomials in a formal
variable q defined by the formula

    \binom{n}{k}_q = \frac{[n]!_q}{[k]!_q [n-k]!_q}
                   = \frac{\prod_{i=1}^{n} (q^i - 1)}{\prod_{i=1}^{k} (q^i - 1) \prod_{i=1}^{n-k} (q^i - 1)}.

We gave several combinatorial interpretations of these polynomials in §8.6. In this section,
we discuss a linear-algebraic interpretation of the integers \binom{n}{k}_q, where the variable q is
set equal to a prime power. To read this section, the reader should have some previous
experience with fields and vector spaces. We begin by using bases to determine the possible
sizes of vector spaces over finite fields.


12.31. Theorem: Size of a Finite Vector Space. Suppose V is a d-dimensional vector
space over a finite field F with q elements. Then |V| = q^d.


Proof. Let (v_1,...,v_d) be an ordered basis for V. By definition of a basis, for each v ∈ V,
there exists exactly one d-tuple of scalars (c_1,...,c_d) ∈ F^d such that v = c_1 v_1 + c_2 v_2 +
··· + c_d v_d. In other words, the function from F^d to V sending (c_1,...,c_d) to \sum_{i=1}^{d} c_i v_i is a
bijection. Because |F| = q, the Product Rule gives |V| = |F^d| = |F|^d = q^d. □


12.32. Theorem: Size of a Finite Field. If K is a finite field, then |K| = p^e for some
prime p and some e ∈ Z_{≥1}.


Proof. Given a finite field K, let F be the cyclic subgroup of the additive group of K
generated by 1_K. The size of F is some finite number p (since K is finite), and p > 1
since 1_K ≠ 0_K. We know that p is the smallest positive integer such that p 1_K = 0_K. One
checks (using the distributive laws) that F is not only an additive subgroup of K, but also
a subring of K. If p were not prime, say p = ab with 1 < a,b < p, then (a 1_K)·(b 1_K) =
ab 1_K = p 1_K = 0_K, and yet a 1_K, b 1_K ≠ 0_K. This contradicts the fact that fields have no zero
divisors. Thus, p must be prime. It now follows that F is a field isomorphic to the field
of integers modulo p. K can be regarded as a vector space over its subfield F, by defining
scalar multiplication (which is a map from F × K into K) to be the restriction of the field
multiplication m : K × K → K. Since K is finite, it must be a finite-dimensional vector
space over F. Thus the required result follows from Theorem 12.31. □


12.33. Remark. One can show that, for every prime power p^e, there exists a finite field of
size p^e, which is unique up to isomorphism. The existence proof is sketched in Exercise 12-36.


We now give the promised linear-algebraic interpretation of q-binomial coefficients.


12.34. Theorem: Subspaces of Finite Vector Spaces. Let K be a finite field with q
elements. For all integers k,n with 0 ≤ k ≤ n and all n-dimensional vector spaces V over
K, the integer \binom{n}{k}_q is the number of k-dimensional subspaces of V.


Proof. Let f(n,k,q) be the number of k-dimensional subspaces of V. One may check that
this number depends only on k, q, and n = dim(V). Recall from Theorem 12.31 that
|V| = q^n and each d-dimensional subspace of V has size q^d. By rearranging factors in the
defining formula for \binom{n}{k}_q, we see that \binom{n}{k}_q = f(n,k,q) holds iff

    f(n,k,q)(q^k - 1)(q^{k-1} - 1)···(q^1 - 1) = (q^n - 1)(q^{n-1} - 1)···(q^{n-k+1} - 1).

We establish this equality by the following counting argument. Let S be the set of all
ordered lists (v_1,...,v_k) of k linearly independent vectors in V. Here is one way to build
such a list. First, choose a nonzero vector v_1 ∈ V in any of q^n - 1 ways. This vector spans
a one-dimensional subspace W_1 of V of size q = q^1. Second, choose a vector v_2 ∈ V - W_1, in
any of q^n - q ways. The list (v_1, v_2) must be linearly independent since v_2 is not in the space
W_1 spanned by v_1. The vectors v_1 and v_2 span a two-dimensional subspace W_2 of V of size
q^2. Third, choose v_3 ∈ V - W_2 in q^n - q^2 ways. Continue similarly. When choosing v_i, we
have already found i - 1 linearly independent vectors v_1,...,v_{i-1} that span a subspace W_{i-1} of
V of size q^{i-1}. Consequently, (v_1,...,v_i) is linearly independent iff we choose v_i ∈ V - W_{i-1},
which is a set of size q^n - q^{i-1}. By the Product Rule, we conclude that

    |S| = \prod_{i=1}^{k} (q^n - q^{i-1}) = \prod_{i=1}^{k} q^{i-1}(q^{n-i+1} - 1)
        = q^{k(k-1)/2} (q^n - 1)(q^{n-1} - 1)···(q^{n-k+1} - 1).

Now we count S in a different way. Observe that the vectors in each list (v_1,...,v_k) ∈ S
span some k-dimensional subspace of V. So we begin by choosing such a subspace W in any
of f(n,k,q) ways. Next we choose v_1,...,v_k ∈ W one at a time, following the same process
used in the first part of the proof. We can choose v_1 in |W| - 1 = q^k - 1 ways, then v_2 in
q^k - q ways, and so on. By the Product Rule,

    |S| = f(n,k,q) \prod_{i=1}^{k} (q^k - q^{i-1}) = f(n,k,q) q^{k(k-1)/2} (q^k - 1)(q^{k-1} - 1)···(q^1 - 1).

Equating the two formulas for |S| and cancelling q^{k(k-1)/2} gives the required result. □
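
For q = 2 and small n, the theorem can be confirmed by listing subspaces directly. The following Python sketch is our own illustration (q_binomial and count_subspaces_gf2 are hypothetical names); vectors over the two-element field are stored as n-bit integers, so that vector addition is bitwise XOR.

```python
from itertools import combinations

def q_binomial(n, k, q):
    """(q^n - 1)(q^(n-1) - 1)...(q^(n-k+1) - 1) / ((q^k - 1)...(q - 1))."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def count_subspaces_gf2(n, k):
    """Count k-dimensional subspaces of the space of n-bit vectors over GF(2)."""
    subspaces = set()
    for vecs in combinations(range(1, 1 << n), k):
        span = {0}
        for v in vecs:
            span |= {w ^ v for w in span}
        if len(span) == 1 << k:           # the k chosen vectors are independent
            subspaces.add(frozenset(span))
    return len(subspaces)

print(count_subspaces_gf2(4, 2), q_binomial(4, 2, 2))
```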


In Theorem 8.34, we saw that

    \binom{n}{k}_q = \sum_{μ ∈ P(k,n-k)} q^{|μ|},

where P(k,n-k) is the set of all integer partitions μ that fit in a k × (n-k) rectangle. In
the rest of this section, we give a second proof of Theorem 12.34 by showing that

    f(n,k,q) = \sum_{μ ∈ P(k,n-k)} q^{|μ|}.

This proof is longer than the one already given, but it reveals a close connection between 
the enumeration of subspaces on one hand, and the enumeration of partitions in a box (or, 
equivalently, lattice paths) on the other hand. 

For definiteness, we work with the vector space V = K^n whose elements are n-tuples of
elements of K. We regard elements of V as row vectors of length n. The key linear-algebraic
fact we need is that every k-dimensional subspace of V = K^n has a unique reduced
row-echelon form basis, as defined below.


12.35. Definition: Reduced Row-Echelon Form. Let A be a k × n matrix with entries
in K. Let A_1,...,A_k ∈ K^n be the k rows of A. We say A is a reduced row-echelon form
(RREF) matrix iff the following conditions hold: (i) A_i ≠ 0 for all i, and the leftmost
nonzero entry of A_i is 1_K (call these entries leading 1's); (ii) if the leading 1 of A_i occurs in
column j(i), then j(1) < j(2) < ··· < j(k); (iii) every leading 1 is the only nonzero entry in
its column. An ordered basis B = (v_1,...,v_k) for a k-dimensional subspace of K^n is called
a RREF basis iff the matrix whose rows are v_1,...,v_k is a RREF matrix.


12.36. Theorem: RREF Bases. Let K be any field. Every k-dimensional subspace of
K^n has a unique RREF basis. Conversely, the rows of every k × n RREF matrix comprise
an ordered basis for a k-dimensional subspace of K^n. Consequently, there is a bijection
between the set of such subspaces and the set of k × n RREF matrices with entries in K.


Proof. We sketch the proof, asking the reader to supply the missing linear algebra details. 

Step 1: We use row-reduction to show that any given k-dimensional subspace W of K^n
has at least one RREF basis. Start with any ordered basis v_1,...,v_k of W, and let A be
the matrix with rows v_1,...,v_k. There are three elementary row operations we can use to
simplify A: interchange two rows; multiply one row by a nonzero scalar (element of K);
add any scalar multiple of one row to a different row. A routine verification shows that
performing any one of these operations has no effect on the subspace spanned by the rows
of A. Therefore, we can create new ordered bases for W by performing sequences of row
operations on A. Using the well-known Gaussian elimination algorithm (also called "row
reduction"), we can bring the matrix A into reduced row-echelon form. The rows of the new
matrix give the required RREF basis of W.
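
To make Step 1 concrete, here is a minimal Python sketch of row reduction over the field of integers modulo a prime p. It is an illustration under our own naming (rref_mod_p is hypothetical), and it uses only the three elementary row operations listed above. Modular inverses are computed with pow(x, -1, p), which requires Python 3.8 or later.

```python
def rref_mod_p(rows, p):
    """Row-reduce a matrix over Z/pZ (p prime); return its nonzero rows in RREF."""
    A = [[x % p for x in row] for row in rows]
    ncols = len(A[0]) if A else 0
    pivot_row = 0
    for col in range(ncols):
        # look for a row at or below pivot_row with a nonzero entry in this column
        src = next((r for r in range(pivot_row, len(A)) if A[r][col]), None)
        if src is None:
            continue
        A[pivot_row], A[src] = A[src], A[pivot_row]          # interchange two rows
        inv = pow(A[pivot_row][col], -1, p)
        A[pivot_row] = [x * inv % p for x in A[pivot_row]]   # scale to get a leading 1
        for r in range(len(A)):
            if r != pivot_row and A[r][col]:
                c = A[r][col]                                # clear the rest of the column
                A[r] = [(x - c * y) % p for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A[:pivot_row]

# Two different bases of the same subspace of (Z/5Z)^4 reduce to the same RREF basis.
basis1 = [[1, 2, 3, 4], [2, 4, 1, 0]]
basis2 = [[3, 1, 4, 4], [2, 4, 1, 0]]     # first row is the sum of the rows of basis1
print(rref_mod_p(basis1, 5))
print(rref_mod_p(basis2, 5))
```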

Step 2: We show that a given subspace W has at most one RREF basis. Use induction on
k, the base case k = 0 being immediate. For the induction step, assume n ≥ 1 and k ≥ 1 are
fixed, and the uniqueness result is known for smaller values of k. Let A and B be two RREF
matrices whose rows form bases of W; we must prove A = B. Let j(1) < j(2) < ··· < j(k)
be the positions of the leading 1's in A, and let r(1) < r(2) < ··· < r(k) be the positions of
the leading 1's in B. If j(1) < r(1), then the first row of A (which is a vector in W) has a 1
in position j(1). This vector cannot possibly be a linear combination of the rows of B, all of
whose nonzero entries occur in columns after j(1). Thus, j(1) < r(1) is impossible. Similar
reasoning rules out r(1) < j(1), so we must have j(1) = r(1). Let W' be the subspace of
W consisting of vectors with zeroes in positions 1,2,...,j(1). Consideration of leading 1's
shows that rows 2 through k of A must form a basis for W', and rows 2 through k of B
also form a basis for W'. Since dim(W') = k - 1, the induction hypothesis implies that
rows 2 through k of A equal the corresponding rows of B. In particular, we now know that
r(i) = j(i) for all i between 1 and k. To finish, we must still check that row 1 of A equals row
1 of B. Let the rows of B be w_1,...,w_k, and write v_1 for the first row of A. Since v_1 ∈ W,
we have v_1 = a_1 w_1 + ··· + a_k w_k for some unique scalars a_i. Consideration of column j(1)
shows that a_1 = 1. On the other hand, if a_i ≠ 0 for some i > 1, then a_1 w_1 + ··· + a_k w_k
would have a nonzero entry in position j(i), whereas v_1 has a zero entry in this position
(since the leading 1's occur in the same columns in A and B). This is a contradiction, so
a_2 = ··· = a_k = 0. Thus v_1 = w_1, as needed, and we have now proved that A = B.

Step 3: We show that the k rows v_1,...,v_k of a given RREF matrix form an ordered
basis for some k-dimensional subspace of K^n. It suffices to show that the rows in question
are linearly independent vectors. Suppose c_1 v_1 + ··· + c_k v_k = 0, where c_i ∈ K. Recall that
the leading 1 in position (i, j(i)) is the only nonzero entry in its column. Therefore, taking
the j(i)th component of the preceding equation, we get c_i = 0 for all i between 1 and k. □


Because of the preceding theorem, the problem of counting k-dimensional subspaces of
K^n (where |K| = q) reduces to the problem of counting k × n RREF matrices with entries
in K. Our second proof of Theorem 12.34 is therefore complete once we prove the following
result.


12.37. Theorem: RREF Matrices. Let K be a finite field with q elements. The number
of k × n RREF matrices with entries in K is

    \sum_{μ ∈ P(k,n-k)} q^{|μ|} = \binom{n}{k}_q.


Proof. Let us classify the k × n RREF matrices based on the columns j(1) < j(2) < ··· <
j(k) where the leading 1's occur. To build a RREF matrix with the leading 1's in these
positions, we must put 0's in all matrix positions (i,p) such that p < j(i); we must also
put 0's in all matrix positions (r,j(i)) such that r < i. However, in all the other positions
to the right of the leading 1's, there is no restriction on the elements that occur except
that they must come from the field K of size q. How many such free positions are there?
The first row contains n - j(1) entries after the leading 1, but k - 1 of these entries are
in columns above other leading 1's. So there are μ_1 = n - j(1) - (k - 1) free positions in
this row. The next row contains n - j(2) entries after the leading 1, but k - 2 of these
occur in columns above other leading 1's. So there are μ_2 = n - j(2) - (k - 2) free positions
in row 2. Similarly, there are μ_i = n - j(i) - (k - i) = n - k + i - j(i) free positions
in row i for i between 1 and k. The condition 1 ≤ j(1) < j(2) < ··· < j(k) ≤ n is
logically equivalent to 0 ≤ j(1) - 1 ≤ j(2) - 2 ≤ ··· ≤ j(k) - k ≤ n - k, which is in turn
equivalent to n - k ≥ μ_1 ≥ μ_2 ≥ ··· ≥ μ_k ≥ 0. Thus, there is a bijection between the
set of valid positions j(1) < j(2) < ··· < j(k) for the leading 1's, and the set of integer
partitions μ = (μ_1,...,μ_k) whose diagrams fit in a k × (n-k) box; this bijection is given by
μ_i = n - k + i - j(i). Furthermore, |μ| = μ_1 + ··· + μ_k is the total number of free positions
in each RREF matrix with leading 1's in the positions j(i). Using the Product Rule to fill
these free positions one at a time with elements of K, we see that there are q^{|μ|} RREF
matrices with leading 1's in the given positions. The theorem now follows from the Sum
Rule, keeping in mind the bijection just constructed between the set of possible j-sequences
and the set P(k,n-k). □
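
The count in this proof can be reproduced numerically. The Python sketch below is our own illustration with hypothetical function names: it runs over all choices of pivot columns, converts each to its partition via μ_i = n - k + i - j(i), and sums q^{|μ|}. The result is compared with the product formula for the q-binomial coefficient.

```python
from itertools import combinations

def pivots_to_partition(pivots, n):
    """mu_i = n - k + i - j(i) for pivot columns j(1) < ... < j(k)."""
    k = len(pivots)
    return [n - k + i - j for i, j in enumerate(pivots, start=1)]

def rref_count(n, k, q):
    """Number of k x n RREF matrices over a field of size q, summed over pivot sets."""
    return sum(q ** sum(pivots_to_partition(pivots, n))
               for pivots in combinations(range(1, n + 1), k))

def q_binomial(n, k, q):
    """(q^n - 1)...(q^(n-k+1) - 1) / ((q^k - 1)...(q - 1))."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

for q in (2, 3, 4, 5):
    assert all(rref_count(n, k, q) == q_binomial(n, k, q)
               for n in range(7) for k in range(n + 1))
```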


12.38. Example. To illustrate the preceding proof, take n = 10 and k = 4, and consider
RREF matrices of the form

    0 1 * * 0 0 * * 0 *
    0 0 0 0 1 0 * * 0 *
    0 0 0 0 0 1 * * 0 *
    0 0 0 0 0 0 0 0 1 *

The stars mark the free positions in the matrix, and (j(1), j(2), j(3), j(4)) = (2,5,6,9). The
associated partition is μ = (5,3,3,1), which does fit in a 4 × 6 box. We can see a reflected
version of the diagram of μ in the matrix by erasing the columns without stars and
right-justifying the remaining columns. We see that there are q^{12} = q^{|μ|} ways of filling in this
template to get an RREF matrix with the leading 1's in the indicated positions.

Going the other way, consider another partition μ = (6,2,2,0) that fits in a 4 × 6 box.
Using the formula j(i) = n - k + i - μ_i, we recover (j(1), j(2), j(3), j(4)) = (1,6,7,10),
which tells us the locations of the leading 1's. So this particular partition corresponds to
RREF matrices that match the following template:

    1 * * * * 0 0 * * 0
    0 0 0 0 0 1 0 * * 0
    0 0 0 0 0 0 1 * * 0
    0 0 0 0 0 0 0 0 0 1

12.8 Tangent and Secant Numbers 


In calculus, one learns the following power series expansions for the trigonometric functions
sine, cosine, and arctangent:

    sin x = x - x^3/3! + x^5/5! - x^7/7! + ··· = \sum_{k=0}^{∞} (-1)^k x^{2k+1}/(2k+1)!;
    cos x = 1 - x^2/2! + x^4/4! - x^6/6! + ··· = \sum_{k=0}^{∞} (-1)^k x^{2k}/(2k)!;
    arctan x = x - x^3/3 + x^5/5 - x^7/7 + ··· = \sum_{k=0}^{∞} (-1)^k x^{2k+1}/(2k+1).

These expansions are all special cases of Taylor's formula f(x) = \sum_{k=0}^{∞} \frac{f^{(k)}(0)}{k!} x^k. Using
Taylor's formula, one can also find power series expansions for the tangent and secant
functions:

    tan x = x + (1/3)x^3 + (2/15)x^5 + (17/315)x^7 + (62/2835)x^9 + (1382/155925)x^{11} + ···;
    sec x = 1 + (1/2)x^2 + (5/24)x^4 + (61/720)x^6 + (277/8064)x^8 + (50521/3628800)x^{10} + ···.

The coefficients of these series seem quite irregular and unpredictable compared to the
preceding three series. Remarkably, as we shall see in this section, these coefficients encode
the solution to a counting problem involving permutations.

It can be shown that the power series expansions for tan x and sec x given by Taylor's
formula do converge for all x in a neighborhood of 0. Furthermore, it is permissible to
compute coefficients in these power series by algebraic manipulation of the identities tan x =
sin x/cos x and sec x = 1/cos x, using the series expansions for sin x and cos x given above.
We could avoid worrying about these technical points by working with formal power series
throughout.

For each n ∈ Z_{≥0}, let a_n be the nth derivative of tan x evaluated at x = 0, and let b_n
be the nth derivative of sec x evaluated at x = 0. By Taylor's formula, we know

    tan x = \sum_{n=0}^{∞} \frac{a_n}{n!} x^n;    sec x = \sum_{n=0}^{∞} \frac{b_n}{n!} x^n.    (12.2)

The first several values of a_n and b_n are

    (a_0, a_1, a_2, ...) = (0, 1, 0, 2, 0, 16, 0, 272, 0, 7936, 0, 353792, ...);    (12.3)
    (b_0, b_1, b_2, ...) = (1, 0, 1, 0, 5, 0, 61, 0, 1385, 0, 50521, ...).

Since tan is an odd function and sec is an even function, it readily follows that a_n = 0
for all even n and b_n = 0 for all odd n.

Next, for each integer $n \geq 1$, let $c_n$ be the number of permutations $w = w_1 w_2 \cdots w_n$ of
$\{1,2,\ldots,n\}$ (or any fixed $n$-letter ordered alphabet) such that

\[
w_1 < w_2 > w_3 < w_4 > \cdots < w_{n-1} > w_n. \tag{12.4}
\]

Note that $c_n = 0$ for all even $n > 0$; we also define $c_0 = 0$. For each integer $n \geq 0$, let $d_n$
be the number of permutations $w$ of $\{1,2,\ldots,n\}$ (or any fixed $n$-letter ordered alphabet)
such that

\[
w_1 < w_2 > w_3 < w_4 > \cdots > w_{n-1} < w_n; \tag{12.5}
\]

note that $d_0 = 1$ and $d_n = 0$ for all odd $n$. By reversing the ordering of the letters, one sees
that $d_n$ also counts permutations $w$ of $n$ letters such that

\[
w_1 > w_2 < w_3 > w_4 < \cdots < w_{n-1} > w_n. \tag{12.6}
\]

Permutations of the form (12.4) or (12.5) are called up-down permutations. We now prove
that $a_n = c_n$ and $b_n = d_n$ for all integers $n \geq 0$. The proof consists of five steps.

Step 1. Differentiating $\tan x = \sin x/\cos x$, one readily calculates that $\frac{d}{dx}(\tan x) = \sec^2 x$.
Differentiating the first series in (12.2), squaring the second series using Definition 5.15(e),
and equating the coefficients of $x^n$, we obtain

\[
\frac{a_{n+1}}{n!} = \sum_{k=0}^{n} \frac{b_k}{k!} \cdot \frac{b_{n-k}}{(n-k)!}
\]

or equivalently,

\[
a_{n+1} = \sum_{k=0}^{n} \binom{n}{k} b_k b_{n-k} \quad \text{for all } n \geq 0. \tag{12.7}
\]


Step 2. Differentiating $\sec x = 1/\cos x$, we find that $\frac{d}{dx}(\sec x) = \tan x \sec x$. Differentiating
the second series in (12.2), multiplying the two series together using Definition 5.15(e),
and equating the coefficients of $x^n$, we obtain

\[
\frac{b_{n+1}}{n!} = \sum_{k=0}^{n} \frac{a_k}{k!} \cdot \frac{b_{n-k}}{(n-k)!}
\]

or equivalently,

\[
b_{n+1} = \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k} \quad \text{for all } n \geq 0. \tag{12.8}
\]
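
As a quick sanity check, recursions (12.7) and (12.8) can be run numerically. The following Python sketch (the function name `tangent_secant_numbers` is our own, not the text's) starts from $a_0 = 0$ and $b_0 = 1$ and should reproduce the values listed in (12.3).

```python
from math import comb

def tangent_secant_numbers(n_max):
    """Return (a, b) with a[n], b[n] for 0 <= n <= n_max, using (12.7) and (12.8)."""
    a = [0] * (n_max + 1)  # a[0] = tan(0) = 0
    b = [0] * (n_max + 1)
    b[0] = 1               # b[0] = sec(0) = 1
    for n in range(n_max):
        # (12.7): a_{n+1} = sum_k C(n,k) b_k b_{n-k}
        a[n + 1] = sum(comb(n, k) * b[k] * b[n - k] for k in range(n + 1))
        # (12.8): b_{n+1} = sum_k C(n,k) a_k b_{n-k}
        b[n + 1] = sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1))
    return a, b

a, b = tangent_secant_numbers(11)
print(a)  # compare with the first row of (12.3)
print(b)  # compare with the second row of (12.3)
```

Both updates for index $n+1$ use only entries of index at most $n$, so the two sequences can be filled in together in a single loop.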


Step 3. We give a counting argument to prove the equation

\[
c_{n+1} = \sum_{k=0}^{n} \binom{n}{k} d_k d_{n-k} \quad \text{for all } n \geq 0. \tag{12.9}
\]

If $n$ is odd, then both sides of this equation are zero, since at least one of $k$ or $n-k$ is odd
for each $k$. Now suppose $n$ is even. How can we build a typical permutation

\[
w = w_1 < w_2 > w_3 < \cdots > w_{n+1}
\]

counted by $c_{n+1}$? Let us first choose the position of 1 in $w$; say $w_{k+1} = 1$ for some $k$ between
0 and $n$. The required inequalities at position $k+1$ are satisfied if and only if $k$ is even.
In the case where $k$ is odd, we know $d_k d_{n-k} = 0$, so this term contributes nothing to the
right side of (12.9). Given that $k$ is even, choose a $k$-element subset $A$ of the $n$ remaining
letters in $\binom{n}{k}$ ways. Use these $k$ letters to fill in the first $k$ positions of $w$, subject to the
required inequalities (12.5), in any of $d_k$ ways. Use the remaining letters to fill in the last
$(n+1)-(k+1) = n-k$ positions of $w$ (subject to the inequalities (12.6), reindexed to
begin at index $k+2$), in any of $d_{n-k}$ ways. Equation (12.9) now follows from the Sum Rule
and Product Rule. See Figure 12.11, in which $w$ is visualized as a sequence of line segments
connecting the points $(i, w_i)$ for $i$ between 1 and $n+1$.

Step 4. We give a counting argument to prove the equation

\[
d_{n+1} = \sum_{k=0}^{n} \binom{n}{k} c_k d_{n-k} \quad \text{for all } n \geq 0. \tag{12.10}
\]

Both sides are zero if $n$ is even. If $n$ is odd, we must build a permutation

\[
w = w_1 < w_2 > w_3 < \cdots < w_{n+1}.
\]

First choose an index $k$ with $0 \leq k \leq n$, and define $w_{k+1} = n+1$. This time, to get a nonzero
contribution from this value of $k$, we need $k$ to be odd. Now pick a $k$-element subset $A$ of
the $n$ remaining letters in $\binom{n}{k}$ ways. Use the letters in $A$ to fill in $w_1 w_2 \cdots w_k$ ($c_k$ ways),
and use the remaining letters to fill in $w_{k+2} \cdots w_{n+1}$ ($d_{n-k}$ ways). See Figure 12.12.

Step 5. A routine induction argument now shows that $a_n = c_n$ and $b_n = d_n$ for all
$n \in \mathbb{Z}_{\geq 0}$, since the pair of sequences $(a_n)$, $(b_n)$ satisfy the same system of recursions and
initial conditions as the pair of sequences $(c_n)$, $(d_n)$. This completes the proof.
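
The identification of $a_n$, $b_n$ with $c_n$, $d_n$ can also be checked by brute force for small $n$. The sketch below enumerates all permutations and counts those with the alternating pattern; the helper names (`is_up_down`, `count_up_down`, `c`, `d`) are our own, and the enumeration is only practical for small $n$ since it visits all $n!$ permutations.

```python
from itertools import permutations

def is_up_down(w):
    """True iff w[0] < w[1] > w[2] < w[3] > ... (alternating, starting with <)."""
    return all(
        (w[i] < w[i + 1]) if i % 2 == 0 else (w[i] > w[i + 1])
        for i in range(len(w) - 1)
    )

def count_up_down(n):
    """Number of permutations of {1,...,n} with the alternating pattern."""
    return sum(1 for w in permutations(range(1, n + 1)) if is_up_down(w))

def c(n):
    # Pattern (12.4) can only be satisfied when n is odd; c_0 = 0 by convention.
    return count_up_down(n) if n % 2 == 1 else 0

def d(n):
    # Pattern (12.5) can only be satisfied when n is even; d_0 = 1 (empty word).
    return count_up_down(n) if n % 2 == 0 else 0

print([c(n) for n in range(8)])  # compare with a_0, ..., a_7 in (12.3)
print([d(n) for n in range(9)])  # compare with b_0, ..., b_8 in (12.3)
```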

[FIGURE 12.11: Counting up-down permutations of odd length. The diagram marks the first k letters, the letter 1, and the last n−k letters.]

[FIGURE 12.12: Counting up-down permutations of even length. The diagram marks the first k letters, the letter n+1, and the last n−k letters.]


12.9 Combinatorial Definition of Determinants 


This section applies material from Chapter 7 to the theory of determinants. Our goal is to 
explain how the combinatorial properties of permutations underlie many basic facts about 
determinants. First we recall the definitions of operations on matrices. 


12.39. Definition: Matrix Operations. For each positive integer $n$, let $M_n(\mathbb{R})$ be the
set of $n \times n$ matrices with entries in $\mathbb{R}$. Given $A \in M_n(\mathbb{R})$, we denote the entry in row $i$,
column $j$ of $A$ by $A(i,j)$. For $A, B \in M_n(\mathbb{R})$ and $c \in \mathbb{R}$, define $A+B$, $AB$, and $cA$ by
setting

\[
\begin{aligned}
(A+B)(i,j) &= A(i,j) + B(i,j); \\
(AB)(i,j) &= \sum_{k=1}^{n} A(i,k)B(k,j); \\
(cA)(i,j) &= c(A(i,j)) \quad \text{for all } i,j \in \{1,2,\ldots,n\}.
\end{aligned}
\]

Routine verifications show that $M_n(\mathbb{R})$ with these operations is a ring; the multiplicative
identity element $I_n$ in $M_n(\mathbb{R})$ is given by $I_n(i,j) = 1$ if $i = j$, and $I_n(i,j) = 0$ if $i \neq j$.
This ring is non-commutative for all $n > 1$. More generally, we could replace $\mathbb{R}$ by any
commutative ring $R$ throughout the following discussion, considering matrices with entries
in $R$.


12.40. Definition: Determinants. For a matrix $A \in M_n(\mathbb{R})$, the determinant of $A$ is

\[
\det(A) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{i=1}^{n} A(i, w(i)).
\]

In this definition, $S_n$ is the set of all permutations of $\{1,2,\ldots,n\}$, and $\operatorname{sgn}(w) =
(-1)^{\operatorname{inv}(w)}$ for $w \in S_n$ (see §7.4). Note that $\det(A)$ is an element of $\mathbb{R}$.
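
The defining formula translates almost literally into code. The following Python sketch (with our own helper names `inv`, `sgn`, and `det`, and with 0-based indices in place of the text's 1-based ones) sums over all $n!$ permutations, so it is meant as an illustration of the definition rather than an efficient method.

```python
from itertools import permutations
from math import prod

def inv(w):
    """Number of inversions: pairs i < j with w[i] > w[j]."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])

def sgn(w):
    """sgn(w) = (-1)^inv(w)."""
    return -1 if inv(w) % 2 else 1

def det(A):
    """Determinant of a square matrix (list of rows) via the permutation sum."""
    n = len(A)
    return sum(
        sgn(w) * prod(A[i][w[i]] for i in range(n))
        for w in permutations(range(n))
    )

print(det([[1, 2], [3, 4]]))  # 1*4 - 2*3 = -2
```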


12.41. Example. When $n = 1$, $\det(A) = A(1,1)$. When $n = 2$, the possible permutations
$w$ (in one-line form) are $w = 12$ with $\operatorname{sgn}(w) = +1$, and $w = 21$ with $\operatorname{sgn}(w) = -1$.
Therefore, the definition gives

\[
\det(A) = \det\begin{bmatrix} A(1,1) & A(1,2) \\ A(2,1) & A(2,2) \end{bmatrix} = +A(1,1)A(2,2) - A(1,2)A(2,1).
\]

When $n = 3$, we find (using the table in Example 7.23) that

\[
\begin{aligned}
\det(A) &= \det\begin{bmatrix} A(1,1) & A(1,2) & A(1,3) \\ A(2,1) & A(2,2) & A(2,3) \\ A(3,1) & A(3,2) & A(3,3) \end{bmatrix} \\
&= +A(1,1)A(2,2)A(3,3) - A(1,1)A(2,3)A(3,2) - A(1,2)A(2,1)A(3,3) \\
&\quad + A(1,2)A(2,3)A(3,1) + A(1,3)A(2,1)A(3,2) - A(1,3)A(2,2)A(3,1).
\end{aligned}
\]

For general $n$, we see that $\det(A)$ is a sum of $n!$ signed terms. A given term arises by
choosing one factor $A(i,w(i))$ from each row of $A$; since $w$ is a permutation, each of the
chosen factors must come from a different column of $A$. The term in question is the product
of the $n$ chosen factors, times $\operatorname{sgn}(w)$. Since $\operatorname{sgn}(w) = (-1)^{\operatorname{inv}(w)}$, the sign attached to this
term depends on the parity of the number of basic transpositions needed to sort the column
indices $w(1), w(2), \ldots, w(n)$ into increasing order (see Theorem 7.29).

The next result shows that we can replace $A(i, w(i))$ by $A(w(i), i)$ in the defining
formula for $\det(A)$. This corresponds to interchanging the roles of rows and columns in the
description above.


12.42. Definition: Transpose of a Matrix. Given $A \in M_n(\mathbb{R})$, the transpose of $A$ is
the matrix $A^{\mathrm{tr}} \in M_n(\mathbb{R})$ such that $A^{\mathrm{tr}}(i,j) = A(j,i)$ for all $i,j \in \{1,2,\ldots,n\}$.


12.43. Theorem: Determinant of a Transpose. For all $A \in M_n(\mathbb{R})$, $\det(A^{\mathrm{tr}}) = \det(A)$.

Proof. By definition,

\[
\det(A^{\mathrm{tr}}) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} A^{\mathrm{tr}}(k, w(k)) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} A(w(k), k).
\]

For a fixed $w \in S_n$, we make a change of variables in the product indexed by $w$ by letting
$j = w(k)$, so $k = w^{-1}(j)$. Since $w$ is a permutation and multiplication in $\mathbb{R}$ is commutative,

\[
\prod_{k=1}^{n} A(w(k), k) = \prod_{j=1}^{n} A(j, w^{-1}(j))
\]

because the second product contains the same factors as the first product in a different
order. Using Theorem 7.31, we now calculate

\[
\det(A^{\mathrm{tr}}) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{j=1}^{n} A(j, w^{-1}(j)) = \sum_{w \in S_n} \operatorname{sgn}(w^{-1}) \prod_{j=1}^{n} A(j, w^{-1}(j)).
\]

Now consider the change of variable $v = w^{-1}$. As $w$ ranges over $S_n$, so does $v$, since the
map sending $w$ to $w^{-1}$ is a bijection on $S_n$. Furthermore, we can reorder the terms of the
sum since addition in $\mathbb{R}$ is commutative. We conclude that

\[
\det(A^{\mathrm{tr}}) = \sum_{v \in S_n} \operatorname{sgn}(v) \prod_{j=1}^{n} A(j, v(j)) = \det(A). \qquad \square
\]

Next we obtain a formula for the determinant of an upper-triangular matrix. 


12.44. Theorem: Determinants of Triangular and Diagonal Matrices. Suppose
$A \in M_n(\mathbb{R})$ is upper-triangular, which means $A(i,j) = 0$ whenever $i > j$. Then $\det(A) =
\prod_{i=1}^{n} A(i,i)$. Consequently, if $A$ is either upper-triangular, lower-triangular, or diagonal,
then $\det(A)$ is the product of the diagonal entries of $A$.


Proof. By definition, $\det(A) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{i=1}^{n} A(i, w(i))$. In order for a given summand
to be nonzero, we must have $i \leq w(i)$ for all $i \in \{1,2,\ldots,n\}$. Since $w$ is a permutation,
we successively deduce that $w(n) = n$, $w(n-1) = n-1$, ..., $w(1) = 1$. Thus, the
only summand that could be nonzero is the one indexed by $w = \mathrm{id}$. Since $\operatorname{sgn}(\mathrm{id}) = +1$
and $\mathrm{id}(i) = i$ for all $i$, the stated formula for $\det(A)$ follows when $A$ is upper-triangular.
The result for lower-triangular $A$ follows by considering $A^{\mathrm{tr}}$. Since diagonal matrices are
upper-triangular, the proof is complete. □


12.45. Corollary: Determinant of Identity Matrix. For all $n > 0$, $\det(I_n) = 1$.


Next we discuss the multilinearity property of determinants and some of its consequences.


12.46. Definition: Linear Maps. For $n \in \mathbb{Z}_{>0}$, a function $T : \mathbb{R}^n \to \mathbb{R}$ is called a linear
map iff $T(v+z) = T(v) + T(z)$ and $T(cv) = cT(v)$ for all $v, z \in \mathbb{R}^n$ and all $c \in \mathbb{R}$.

12.47. Example. Suppose $b_1, \ldots, b_n \in \mathbb{R}$ are fixed constants, and $T : \mathbb{R}^n \to \mathbb{R}$ is defined
by

\[
T(v_1, \ldots, v_n) = b_1 v_1 + b_2 v_2 + \cdots + b_n v_n \quad \text{for all } (v_1, \ldots, v_n) \in \mathbb{R}^n.
\]

It is routine to check that $T$ is a linear map. Conversely, one can show that every linear
map from $\mathbb{R}^n$ to $\mathbb{R}$ must be of this form for some (unique) choice of $b_1, \ldots, b_n \in \mathbb{R}$.


12.48. Theorem: Multilinearity of Determinants. Let $A \in M_n(\mathbb{R})$, and let $k$ be a
fixed row index in $\{1,\ldots,n\}$. Given a row vector $v \in \mathbb{R}^n$, let $A[v]$ denote the matrix $A$
with row $k$ replaced by $v$. Then $T : \mathbb{R}^n \to \mathbb{R}$ given by $T(v) = \det(A[v])$ is a linear map.
The same result holds with “row” replaced by “column” everywhere.

Proof. By Example 12.47, it suffices to show that there exist constants $b_1, \ldots, b_n \in \mathbb{R}$ such
that for all $v = (v_1, v_2, \ldots, v_n) \in \mathbb{R}^n$,

\[
T(v) = b_1 v_1 + b_2 v_2 + \cdots + b_n v_n. \tag{12.11}
\]

To prove this, consider the defining formula for $\det(A[v])$:

\[
T(v) = \det(A[v]) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{i=1}^{n} A[v](i, w(i))
= \sum_{j=1}^{n} \left[ \sum_{\substack{w \in S_n \\ w(k) = j}} \operatorname{sgn}(w) \prod_{\substack{i=1 \\ i \neq k}}^{n} A(i, w(i)) \right] v_j.
\]

The bracketed expressions depend only on the fixed matrix $A$, not on $v$. So (12.11) holds
with

\[
b_j = \sum_{\substack{w \in S_n \\ w(k) = j}} \operatorname{sgn}(w) \prod_{\substack{i=1 \\ i \neq k}}^{n} A(i, w(i)) \quad \text{for } j \text{ between 1 and } n. \tag{12.12}
\]

To obtain the analogous fact for the columns of $A$, apply the result just proved to $A^{\mathrm{tr}}$. □


We sometimes use the following notation when invoking the multilinearity of determinants.
For $A \in M_n(\mathbb{R})$, let $A_1, A_2, \ldots, A_n$ denote the $n$ rows of $A$; thus each $A_i$ lies in $\mathbb{R}^n$.
We write $\det(A) = \det(A_1, \ldots, A_n)$, viewing the determinant as a function of $n$ inputs (each
of which is a row vector). The previous result says if we let the $k$th input vary while holding
the other $n-1$ inputs fixed, the resulting function sending $v \in \mathbb{R}^n$ to $\det(A_1, \ldots, v, \ldots, A_n)$
is a linear map from $\mathbb{R}^n$ to $\mathbb{R}$.


12.49. Theorem: Alternating Property of Determinants. If $A \in M_n(\mathbb{R})$ has two
equal rows or two equal columns, then $\det(A) = 0$.


Proof. Recall $\det(A)$ is a sum of $n!$ signed terms of the form $T(w) = \operatorname{sgn}(w) \prod_{i=1}^{n} A(i, w(i))$,
where $w$ ranges over $S_n$. Suppose rows $r$ and $s$ of $A$ are equal, so $A(r,k) = A(s,k)$ for all
$k$. We define an involution $I$ on $S_n$ with no fixed points such that $T(I(w)) = -T(w)$
for all $w \in S_n$. It follows that the $n!$ terms cancel in pairs, so that $\det(A) = 0$. Define
$I(w) = w \circ (r,s)$ for $w \in S_n$. We see at once that $I \circ I = \mathrm{id}_{S_n}$ and $I$ has no fixed points.
On one hand, $\operatorname{sgn}(I(w)) = \operatorname{sgn}(w) \cdot \operatorname{sgn}((r,s)) = -\operatorname{sgn}(w)$ by Theorem 7.31. On the other
hand,

\[
\begin{aligned}
\prod_{i=1}^{n} A(i, [w \circ (r,s)](i)) &= A(r, w(s)) A(s, w(r)) \prod_{i \neq r,s} A(i, w(i)) \\
&= A(s, w(s)) A(r, w(r)) \prod_{i \neq r,s} A(i, w(i)) = \prod_{i=1}^{n} A(i, w(i)).
\end{aligned}
\]

Combining these facts, we see that $T(I(w)) = -T(w)$, as needed. Now if $B$ has two equal
columns, then $B^{\mathrm{tr}}$ has two equal rows, so $\det(B) = \det(B^{\mathrm{tr}}) = 0$. □
The next theorem shows how elementary row operations (used in the Gaussian elimination
algorithm) affect the determinant of a square matrix.


12.50. Theorem: Elementary Row Operations and Determinants. Suppose $A, B \in
M_n(\mathbb{R})$, $j, k \in \{1,2,\ldots,n\}$, $j \neq k$, and $c \in \mathbb{R}$.

(a) If $B$ is obtained from $A$ by multiplying row $j$ by $c$, then $\det(B) = c\det(A)$.

(b) If $B$ is obtained from $A$ by interchanging rows $j$ and $k$, then $\det(B) = -\det(A)$.

(c) If $B$ is obtained from $A$ by adding $c$ times row $j$ to row $k$, then $\det(B) = \det(A)$.
Analogous results hold with “row” replaced by “column” everywhere.


Proof. Part (a) is a special case of the multilinearity of determinants (see Theorem 12.48).
Part (b) is a consequence of multilinearity and the alternating property. Specifically, define
$T : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ by letting $T(v,w) = \det(A_1, \ldots, v, \ldots, w, \ldots, A_n)$, where the row vectors
$v$ and $w$ occur in positions $j$ and $k$. Since det is multilinear and alternating, we have for
each $v, w \in \mathbb{R}^n$:

\[
0 = T(v+w, v+w) = T(v,v) + T(w,v) + T(v,w) + T(w,w) = T(w,v) + T(v,w).
\]

Thus, $T(w,v) = -T(v,w)$, which reduces to statement (b) upon choosing $v = A_j$ and
$w = A_k$. Part (c) follows for similar reasons, since

\[
T(v, cv+w) = cT(v,v) + T(v,w) = T(v,w). \qquad \square
\]

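
Theorem 12.50 is what makes Gaussian elimination usable for computing determinants. The sketch below is one possible illustration under our own naming (`det_by_elimination`): it uses only row swaps (part (b), each flipping the sign) and row additions (part (c), leaving the determinant unchanged) to reach an upper-triangular matrix, then multiplies the diagonal entries as in Theorem 12.44. Exact rational arithmetic via `fractions.Fraction` is an assumption made here to avoid rounding.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Determinant via row reduction to upper-triangular form."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero column below the diagonal forces det = 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign        # Theorem 12.50(b): a row swap negates the determinant
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # Theorem 12.50(c): adding a multiple of one row to another keeps det fixed
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]       # Theorem 12.44: product of diagonal entries
    return result

print(det_by_elimination([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```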
Our next result is a recursive formula for computing determinants. 


12.51. Theorem: Laplace Expansions of Determinants. For $A \in M_n(\mathbb{R})$ and all
$i,j \in \{1,2,\ldots,n\}$, let $A[i|j]$ be the matrix in $M_{n-1}(\mathbb{R})$ obtained by deleting row $i$ and
column $j$ of $A$. For each fixed $k$ in the range $1 \leq k \leq n$,

\[
\begin{aligned}
\det(A) &= \sum_{i=1}^{n} (-1)^{i+k} A(i,k) \det(A[i|k]) && \text{(expansion along column } k\text{)} \\
&= \sum_{j=1}^{n} (-1)^{j+k} A(k,j) \det(A[k|j]) && \text{(expansion along row } k\text{).}
\end{aligned}
\]


Proof. We first prove the expansion formula along row $k = n$. By the proof of multilinearity
(see Equations (12.11) and (12.12)), we know that

\[
\det(A) = b_1 A(n,1) + b_2 A(n,2) + \cdots + b_n A(n,n)
\]

where

\[
b_j = \sum_{\substack{w \in S_n \\ w(n) = j}} \operatorname{sgn}(w) \prod_{i=1}^{n-1} A(i, w(i)) \quad \text{for } j \text{ between 1 and } n.
\]

To prove the formula in the theorem, it is enough to show that $b_j = (-1)^{j+n} \det(A[n|j])$
for all $j$ between 1 and $n$.

Fix an index $j$. Let $S_{n,j} = \{w \in S_n : w(n) = j\}$. We define a bijection $f : S_{n,j} \to S_{n-1}$
as follows. Every $w \in S_{n,j}$ can be written in one-line form as $w = w_1 w_2 \cdots w_{n-1} w_n$ where
$w_n = j$. Define $f(w) = w'_1 w'_2 \cdots w'_{n-1}$ where $w'_t = w_t$ if $w_t < j$, and $w'_t = w_t - 1$ if
$w_t > j$. In other words, we drop the $j$ at the end of $w$ and subtract 1 from all letters
larger than $j$. The inverse map acts on $w' \in S_{n-1}$ by adding 1 to all letters in $w'$ weakly
exceeding $j$ and then putting a $j$ at the end of the one-line form. Observe that the deletion
of $j$ decreases $\operatorname{inv}(w)$ by $n - j$ (which is the number of letters to the left of $j$ that are
greater than $j$), and the subtraction operation has no further effect on the inversion count.
So, $\operatorname{inv}(f(w)) = \operatorname{inv}(w) - (n-j)$ and $\operatorname{sgn}(f(w)) = (-1)^{j+n} \operatorname{sgn}(w)$. We also note that for
$w' = f(w)$, we have $A(i, w(i)) = A[n|j](i, w'(i))$ for all $i < n$, since all columns in $A$ after
column $j$ get shifted left one column when column $j$ is deleted. Now we use the bijection $f$
to change the summation variable in the formula for $b_j$. Writing $w' = f(w)$, we obtain

\[
\begin{aligned}
b_j &= \sum_{w \in S_{n,j}} \operatorname{sgn}(w) \prod_{i=1}^{n-1} A(i, w(i)) \\
&= \sum_{w' \in S_{n-1}} (-1)^{j+n} \operatorname{sgn}(w') \prod_{i=1}^{n-1} A[n|j](i, w'(i)) = (-1)^{j+n} \det(A[n|j]).
\end{aligned}
\]


The expansion along an arbitrary row $k$ follows from the special case $k = n$. Given $k < n$,
let $B$ be the matrix obtained from $A$ by interchanging rows $k$ and $k+1$, then rows $k+1$
and $k+2$, and so on, until the original row $k$ has reached the last row of the matrix. The
procedure converting $A$ to $B$ involves $n-k$ row interchanges, so $\det(B) = (-1)^{n-k} \det(A)$.
Moreover, $B(n,j) = A(k,j)$ and $B[n|j] = A[k|j]$ for $j$ between 1 and $n$. So

\[
\begin{aligned}
\det(A) = (-1)^{k-n} \det(B) &= (-1)^{k-n} \sum_{j=1}^{n} (-1)^{j+n} B(n,j) \det(B[n|j]) \\
&= \sum_{j=1}^{n} (-1)^{j+k} A(k,j) \det(A[k|j]).
\end{aligned}
\]


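Finally, to derive the expansion along column $k$, we transpose the matrix. We see from the
definitions that $A^{\mathrm{tr}}[k|j] = A[j|k]^{\mathrm{tr}}$, and this matrix has the same determinant as $A[j|k]$.
Therefore,

\[
\begin{aligned}
\det(A) = \det(A^{\mathrm{tr}}) &= \sum_{j=1}^{n} (-1)^{j+k} A^{\mathrm{tr}}(k,j) \det(A^{\mathrm{tr}}[k|j]) \\
&= \sum_{j=1}^{n} (-1)^{j+k} A(j,k) \det(A[j|k]). \qquad \square
\end{aligned}
\]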
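
The Laplace expansions give a direct recursive procedure. The following sketch (helper names `minor`, `det_row`, `det_col` are ours; indices are 0-based, which leaves the parity of the sign exponent unchanged) expands along a chosen row or column and evaluates the smaller determinants by expansion along their first row.

```python
def minor(A, i, j):
    """Delete row i and column j (0-based), mirroring A[i|j] in the text."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det_row(A, k=0):
    """Laplace expansion along row k; minors are expanded along their first row."""
    n = len(A)
    if n == 0:
        return 1  # empty product convention
    return sum((-1) ** (k + j) * A[k][j] * det_row(minor(A, k, j)) for j in range(n))

def det_col(A, k=0):
    """Laplace expansion along column k."""
    n = len(A)
    if n == 0:
        return 1
    return sum((-1) ** (i + k) * A[i][k] * det_row(minor(A, i, k)) for i in range(n))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print([det_row(A, k) for k in range(3)])  # every row gives the same value
print([det_col(A, k) for k in range(3)])  # and so does every column
```

The running time of this recursion grows like $n!$, so it is a teaching device rather than a practical algorithm for large matrices.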

Next we show how the Laplace expansions for $\det(A)$ lead to explicit formulas for the
entries in the inverse of a matrix.

12.52. Definition: Classical Adjoint of a Matrix. Given $A \in M_n(\mathbb{R})$, let $\operatorname{adj} A \in
M_n(\mathbb{R})$ be the matrix with $i,j$-entry $(-1)^{i+j} \det(A[j|i])$ for $i,j \in \{1,2,\ldots,n\}$.

The next result explains why $A[j|i]$ appears instead of $A[i|j]$ in the preceding definition.

12.53. Theorem: Adjoint Formula. For all $A \in M_n(\mathbb{R})$, we have

\[
A(\operatorname{adj} A) = \det(A) I_n = (\operatorname{adj} A) A.
\]

Proof. For $i$ between 1 and $n$, the $i,i$-entry of the product $A(\operatorname{adj} A)$ is

\[
\sum_{k=1}^{n} A(i,k) [\operatorname{adj} A](k,i) = \sum_{k=1}^{n} (-1)^{i+k} A(i,k) \det(A[i|k]) = \det(A),
\]

by Laplace expansion along row $i$ of $A$. Now suppose $i \neq j$. The $i,j$-entry of $A(\operatorname{adj} A)$ is

\[
\sum_{k=1}^{n} A(i,k) [\operatorname{adj} A](k,j) = \sum_{k=1}^{n} (-1)^{j+k} A(i,k) \det(A[j|k]).
\]

Let $C$ be the matrix obtained from $A$ by replacing row $j$ of $A$ by row $i$ of $A$. Then $C(j,k) =
A(i,k)$ and $C[j|k] = A[j|k]$ for all $k$. So the preceding expression is the Laplace expansion
for $\det(C)$ along row $j$. On the other hand, $\det(C) = 0$ because $C$ has two equal rows. So
$[A(\operatorname{adj} A)](i,j) = 0$. We have proved that $A(\operatorname{adj} A)$ is a diagonal matrix with all diagonal
entries equal to $\det(A)$, so that $A(\operatorname{adj} A) = \det(A) I_n$. The analogous result for $(\operatorname{adj} A)A$ is
proved similarly, using column expansions. □


12.54. Corollary: Formula for the Inverse of a Matrix. If $A \in M_n(\mathbb{R})$ and $\det(A)$
is a nonzero element of $\mathbb{R}$, then the matrix $A$ is invertible in $M_n(\mathbb{R})$ with inverse

\[
A^{-1} = \frac{1}{\det(A)} \operatorname{adj} A.
\]

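
The adjoint formula can be tried out on a small integer matrix. The sketch below is an illustration under our own naming (`minor`, `det`, `adj`, `inverse`, `matmul`), using 0-based indices and exact fractions; note that the $(i,j)$ entry of the adjoint uses the minor with row $j$ and column $i$ deleted, as in Definition 12.52.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Laplace expansion along the first row."""
    n = len(A)
    if n == 0:
        return 1
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

def adj(A):
    """Classical adjoint: entry (i, j) is (-1)^(i+j) det(A[j|i])."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)] for i in range(n)]

def inverse(A):
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0, so the adjoint formula gives no inverse")
    return [[Fraction(entry) / d for entry in row] for row in adj(A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(matmul(A, adj(A)))      # det(A) times the identity matrix
print(matmul(A, inverse(A)))  # the identity matrix
```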
12.55. Remark. Conversely, if $A$ is invertible in $M_n(\mathbb{R})$, then $\det(A)$ is a nonzero element
of $\mathbb{R}$. The proof uses the following Product Formula for Determinants:

\[
\det(AB) = \det(A)\det(B) = \det(B)\det(A) \quad \text{for all } A, B \in M_n(\mathbb{R}).
\]

Taking $B = A^{-1}$, the left side becomes $\det(I_n) = 1$, so $\det(B)$ is a two-sided inverse of
$\det(A)$ in $\mathbb{R}$. In particular, $\det(A)$ and $\det(B)$ are nonzero. We deduce the Product Formula
as a consequence of the Cauchy-Binet Theorem, which is proved in the next section.


More generally, when $\mathbb{R}$ is replaced by a commutative ring $R$, the same proof shows that
a matrix $A$ is an invertible element of the ring $M_n(R)$ iff $\det(A)$ is an invertible element of
the ring $R$.




12.10 The Cauchy-Binet Theorem


We continue our study of determinants by giving a combinatorial proof of the Cauchy-Binet
Theorem, which expresses the determinant of a product of rectangular matrices as the sum
of products of determinants of certain submatrices. This proof is a nice application of the
properties of inversions and determinants.

To state the Cauchy-Binet Theorem, we need the following notation. Given a $c \times d$
matrix $M$, write $M_i$ for the $i$th row of $M$ and $M^j$ for the $j$th column of $M$. Given indices
$j_1, \ldots, j_c \in \{1,2,\ldots,d\}$, let $(M^{j_1}, \ldots, M^{j_c})$ denote the $c \times c$ matrix whose columns are
$M^{j_1}, \ldots, M^{j_c}$ in this order. Similarly, given $i_1, \ldots, i_d \in \{1,2,\ldots,c\}$, let $(M_{i_1}, \ldots, M_{i_d})$ be
the $d \times d$ matrix whose rows are $M_{i_1}, \ldots, M_{i_d}$ in this order.


12.56. The Cauchy-Binet Theorem. Suppose $m \leq n$, $A$ is an $m \times n$ matrix, and $B$ is
an $n \times m$ matrix. Let $J$ be the set of all lists $j = (j_1, j_2, \ldots, j_m)$ such that $1 \leq j_1 < j_2 <
\cdots < j_m \leq n$. Then

\[
\det(AB) = \sum_{j \in J} \det(A^{j_1}, A^{j_2}, \ldots, A^{j_m}) \det(B_{j_1}, B_{j_2}, \ldots, B_{j_m}).
\]

Proof. All matrices appearing in the displayed formula are $m \times m$, so all the determinants
are defined. We begin by using the definitions of matrix products and determinants (§12.9)
to write

\[
\det(AB) = \sum_{w \in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} (AB)(i, w(i)) = \sum_{w \in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} \left[ \sum_{k_i=1}^{n} A(i, k_i) B(k_i, w(i)) \right].
\]

The Generalized Distributive Law changes the product of sums into a sum of products:

\[
\det(AB) = \sum_{w \in S_m} \sum_{k_1=1}^{n} \cdots \sum_{k_m=1}^{n} \operatorname{sgn}(w) \prod_{i=1}^{m} A(i, k_i) \prod_{i=1}^{m} B(k_i, w(i)).
\]


Let $K$ be the set of all lists $k = (k_1, \ldots, k_m)$ with every $k_i \in \{1,2,\ldots,n\}$, and let $K'$ be the
set of lists in $K$ whose entries $k_i$ are distinct. We can combine the $m$ separate sums over
$k_1, \ldots, k_m$ into a single sum over lists $k \in K$. We can also reorder the summations to get

\[
\det(AB) = \sum_{k \in K} \sum_{w \in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} A(i, k_i) \prod_{i=1}^{m} B(k_i, w(i)).
\]


Next, factor out quantities that do not depend on $w$:

\[
\det(AB) = \sum_{k \in K} \prod_{i=1}^{m} A(i, k_i) \left[ \sum_{w \in S_m} \operatorname{sgn}(w) \prod_{i=1}^{m} B(k_i, w(i)) \right].
\]

The term in brackets is the defining formula for $\det(B_{k_1}, \ldots, B_{k_m})$. If any two entries in
$(k_1, \ldots, k_m)$ are equal, this matrix has two equal rows, so its determinant is zero. Discarding
these terms, we are reduced to summing over lists $k \in K'$. So now we have

\[
\det(AB) = \sum_{k \in K'} \prod_{i=1}^{m} A(i, k_i) \det(B_{k_1}, \ldots, B_{k_m}).
\]


To continue, observe that for every list $k \in K'$ there exists a unique list $j \in J$, denoted
$j = \operatorname{sort}(k)$, obtained by sorting the entries of $k$ into increasing order. Grouping summands
gives

\[
\det(AB) = \sum_{j \in J} \sum_{\substack{k \in K': \\ \operatorname{sort}(k) = j}} \prod_{i=1}^{m} A(i, k_i) \det(B_{k_1}, \ldots, B_{k_m}).
\]

Given that $\operatorname{sort}(k) = j$, we can change the matrix $(B_{k_1}, \ldots, B_{k_m})$ into the matrix
$(B_{j_1}, \ldots, B_{j_m})$ by repeatedly switching adjacent rows. Each such switch flips the sign of the
determinant, and one checks that the number of row switches required is $\operatorname{inv}(k_1 k_2 \cdots k_m)$.
(To see this, adapt the proof of Theorem 7.29 to the case where the objects being sorted
are $j_1 < j_2 < \cdots < j_m$ instead of $1 < 2 < \cdots < m$.) Letting $\operatorname{sgn}(k) = (-1)^{\operatorname{inv}(k)}$, we can
therefore write

\[
\det(AB) = \sum_{j \in J} \sum_{\substack{k \in K': \\ \operatorname{sort}(k) = j}} \operatorname{sgn}(k) \prod_{i=1}^{m} A(i, k_i) \det(B_{j_1}, \ldots, B_{j_m}).
\]


The determinant in this formula depends only on $j$, not on $k$, so it can be brought out of
the inner summation:

\[
\det(AB) = \sum_{j \in J} \det(B_{j_1}, \ldots, B_{j_m}) \sum_{\substack{k \in K': \\ \operatorname{sort}(k) = j}} \operatorname{sgn}(k) \prod_{i=1}^{m} A(i, k_i).
\]


To finish, note that every $k \in K'$ that sorts to $j$ can be written as $(k_1, \ldots, k_m) =
(j_{v(1)}, \ldots, j_{v(m)})$ for a uniquely determined permutation $v \in S_m$. Since $j$ is an increasing
sequence, it follows that $\operatorname{inv}(k) = \operatorname{inv}(v)$ and $\operatorname{sgn}(k) = \operatorname{sgn}(v)$. Changing variables in
the inner summation, we get

\[
\det(AB) = \sum_{j \in J} \det(B_{j_1}, \ldots, B_{j_m}) \left[ \sum_{v \in S_m} \operatorname{sgn}(v) \prod_{i=1}^{m} A(i, j_{v(i)}) \right].
\]

The term in brackets is none other than $\det(A^{j_1}, \ldots, A^{j_m})$, so the proof is complete. □


12.57. The Product Formula for Determinants. If $A$ and $B$ are $m \times m$ matrices,
then $\det(AB) = \det(A)\det(B)$.


Proof. Take $n = m$ in the Cauchy-Binet Theorem. The index set $J$ consists of the single
list $(1,2,\ldots,m)$, and the summand corresponding to this list reduces to $\det(A)\det(B)$. □
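
Both the Cauchy-Binet Theorem and the Product Formula are easy to test on small integer matrices. The sketch below is only an illustration, with our own helper names (`sgn`, `det`, `matmul`, `cauchy_binet_sum`) and 0-based indices: it compares $\det(AB)$ with the sum over increasing index lists $j$, choosing columns of $A$ and rows of $B$.

```python
from itertools import combinations, permutations
from math import prod

def sgn(w):
    inversions = sum(1 for i, j in combinations(range(len(w)), 2) if w[i] > w[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant via the permutation sum of Definition 12.40."""
    n = len(M)
    return sum(sgn(w) * prod(M[i][w[i]] for i in range(n)) for w in permutations(range(n)))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def cauchy_binet_sum(A, B):
    """Right side of the Cauchy-Binet formula for A (m x n) and B (n x m)."""
    m, n = len(A), len(B)
    total = 0
    for j in combinations(range(n), m):                    # j_1 < j_2 < ... < j_m
        A_cols = [[A[i][c] for c in j] for i in range(m)]  # columns j of A
        B_rows = [B[r] for r in j]                         # rows j of B
        total += det(A_cols) * det(B_rows)
    return total

A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]
print(det(matmul(A, B)), cauchy_binet_sum(A, B))  # the two values agree
```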


Other examples of combinatorial proofs of determinant formulas appear in §10.15 
and §12.11. 


12.11 Tournaments and the Vandermonde Determinant 


This section uses the combinatorics of tournaments to prove Vandermonde’s determinant 
formula. 


12.58. Definition: Tournaments. An $n$-player tournament is a digraph $\tau$ with vertex
set $\{1,2,\ldots,n\}$ such that for all vertices $i \neq j$, exactly one of the directed edges $(i,j)$ or
$(j,i)$ is an edge of $\tau$. Also, no loop edge $(i,i)$ is an edge of $\tau$. Let $T_n$ be the set of all $n$-player
tournaments.


Intuitively, the $n$ vertices represent $n$ players who compete in a series of one-on-one
matches. Each player plays every other player exactly once, and there are no ties. If player $i$
beats player $j$, the edge $(i,j)$ is part of the tournament; otherwise, the edge $(j,i)$ is included.


12.59. Definition: Weights, Inversions, and Sign for Tournaments. Given a tournament
$\tau \in T_n$, the weight of $\tau$ is $\operatorname{wt}(\tau) = \prod_{i=1}^{n} x_i^{\operatorname{outdeg}_\tau(i)}$. The inversion statistic for $\tau$,
denoted $\operatorname{inv}(\tau)$, is the number of $i < j$ such that $(j,i)$ is an edge of $\tau$. The sign of $\tau$ is
$\operatorname{sgn}(\tau) = (-1)^{\operatorname{inv}(\tau)}$.


Informally, $\operatorname{wt}(\tau) = x_1^{e_1} \cdots x_n^{e_n}$ means that player $i$ beats $e_i$ other players for all $i$ between
1 and $n$. If we think of the players’ numbers $1,2,\ldots,n$ as giving the initial rankings of each
player, with 1 being the highest rank, then $\operatorname{inv}(\tau)$ counts the number of times a lower-ranked
player beats a higher-ranked player in the tournament $\tau$.


12.60. Example. Consider the tournament $\tau \in T_5$ with edge set

\[
\{(1,3), (1,4), (1,5), (2,1), (2,4), (3,2), (3,4), (3,5), (5,2), (5,4)\}.
\]

We have $\operatorname{wt}(\tau) = x_1^3 x_2^2 x_3^3 x_5^2$, $\operatorname{inv}(\tau) = 4$, and $\operatorname{sgn}(\tau) = +1$.
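
The statistics in this example can be recomputed mechanically from the edge set. In the short sketch below (function names `outdegrees`, `inv`, and `sgn` are ours), the weight is represented by its list of exponents, that is, the outdegrees of players $1, \ldots, n$.

```python
from collections import Counter

edges = {(1, 3), (1, 4), (1, 5), (2, 1), (2, 4),
         (3, 2), (3, 4), (3, 5), (5, 2), (5, 4)}

def outdegrees(edges, n):
    """Exponents e_1, ..., e_n of the weight: how many players each player beats."""
    wins = Counter(u for (u, v) in edges)
    return [wins[i] for i in range(1, n + 1)]

def inv(edges):
    """Number of edges (j, i) with j > i: a lower-ranked player beats a higher-ranked one."""
    return sum(1 for (u, v) in edges if u > v)

def sgn(edges):
    return (-1) ** inv(edges)

print(outdegrees(edges, 5))  # [3, 2, 3, 0, 2]
print(inv(edges), sgn(edges))  # 4 1
```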


12.61. Theorem: Tournament Generating Function. For all integers $n \geq 1$,

\[
\sum_{\tau \in T_n} \operatorname{sgn}(\tau) \operatorname{wt}(\tau) = \prod_{1 \leq i < j \leq n} (x_i - x_j).
\]

Proof. We can build a tournament $\tau \in T_n$ by making a sequence of binary choices, indexed
by the pairs $(i,j)$ with $1 \leq i < j \leq n$. For each $i < j$, we choose one of the edges $(i,j)$ or
$(j,i)$ and add it to the tournament’s edge set. Let us examine the effect of this choice on
$\operatorname{wt}(\tau)$, $\operatorname{inv}(\tau)$, and $\operatorname{sgn}(\tau)$. If we pick the edge $(i,j)$ (so $i$ beats $j$), then the exponent of $x_i$
goes up by 1, inversions go up by 0, and the sign is unchanged. If we pick edge $(j,i)$ instead
(so $j$ beats $i$), then the exponent of $x_j$ goes up by 1, inversions go up by 1, and the sign is
multiplied by $-1$. The generating function $(+x_i - x_j)$ records the effect of this choice. The
proof is completed by invoking the Product Rule for Generating Functions. □
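
For small $n$ the two sides of Theorem 12.61 can be compared term by term. The following sketch is our own illustration (the names `tournament_side` and `product_side` are hypothetical): it stores a polynomial as a dictionary mapping exponent tuples $(e_1, \ldots, e_n)$ to integer coefficients, builds the left side by enumerating all $2^{n(n-1)/2}$ tournaments, and builds the right side by multiplying out the factors $(x_i - x_j)$ one at a time.

```python
from collections import Counter
from itertools import combinations, product

def tournament_side(n):
    """Sum of sgn(tau) * wt(tau) over all n-player tournaments."""
    pairs = list(combinations(range(n), 2))  # 0-based pairs i < j
    poly = Counter()
    for choices in product([0, 1], repeat=len(pairs)):
        exps = [0] * n
        inversions = 0
        for (i, j), upset in zip(pairs, choices):
            if upset:            # edge (j, i): j beats i, one more inversion
                exps[j] += 1
                inversions += 1
            else:                # edge (i, j): i beats j
                exps[i] += 1
        poly[tuple(exps)] += (-1) ** inversions
    return {e: c for e, c in poly.items() if c != 0}

def product_side(n):
    """Expand the product of (x_i - x_j) over all pairs i < j."""
    poly = {(0,) * n: 1}
    for i, j in combinations(range(n), 2):
        new = Counter()
        for e, c in poly.items():
            ei = list(e); ei[i] += 1
            ej = list(e); ej[j] += 1
            new[tuple(ei)] += c  # contribution of +x_i
            new[tuple(ej)] -= c  # contribution of -x_j
        poly = {e: c for e, c in new.items() if c != 0}
    return poly

print(tournament_side(3) == product_side(3))  # True
```

For $n = 3$, only six monomials survive on either side: the two cyclic tournaments both have weight $x_1 x_2 x_3$ and opposite signs, so they cancel.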


Given a tournament $\tau$, there may exist three players $u, v, w$ where $u$ beats $v$, $v$ beats $w$,
and $w$ beats $u$. This situation occurs whenever the digraph $\tau$ contains a directed 3-cycle.
Tournaments where this circularity condition does not occur are given a special name.


12.62. Definition: Transitive Tournaments. A tournament $\tau \in T_n$ is transitive iff for
all vertices $u, v, w$, if $(u,v)$ and $(v,w)$ are edges of $\tau$, then $(u,w)$ is an edge of $\tau$.


Note that if $(u,v)$ and $(v,w)$ are edges in $\tau$, we must have $w \neq u$, and then $(w,u)$ is an
edge of $\tau$ iff $(u,w)$ is not an edge of $\tau$. It follows that a tournament $\tau$ is not transitive iff
there exist vertices $u, v, w$ such that $(u,v)$, $(v,w)$, and $(w,u)$ are all edges of $\tau$.


12.63. Theorem: Generating Function for Transitive Tournaments. Let $T'_n$ be the
set of transitive tournaments in $T_n$. Then

\[
\sum_{\tau \in T'_n} \operatorname{sgn}(\tau) \operatorname{wt}(\tau) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} x_{w(k)}^{n-k}.
\]

Proof. We define a bijection $f : T'_n \to S_n$ that will be used to transfer signs and weights
from $T'_n$ to $S_n$. Given $\tau \in T'_n$, define an associated relation $\preceq$ on $\{1,2,\ldots,n\}$ by setting
$u \preceq v$ iff $u = v$ or $(u,v)$ is an edge of $\tau$. This relation is reflexive, antisymmetric (since $\tau$ is a
tournament), and transitive (since $\tau$ is transitive). Furthermore, for all $u, v \in \{1,2,\ldots,n\}$,
$u \preceq v$ or $v \preceq u$ since $\tau$ is a tournament. So $\preceq$ is a total ordering of $\{1,2,\ldots,n\}$. This ordering
determines a unique permutation $w = f(\tau) \in S_n$ that satisfies $w_1 \prec w_2 \prec \cdots \prec w_n$. For
each $k$, player $w_k$ beats players $w_m$ for all $m > k$ and loses to players $w_m$ for all $m < k$.
One readily checks that $f$ is a bijection; the inverse map sends $w \in S_n$ to the transitive
tournament with edge set $\{(w_i, w_j) : 1 \leq i < j \leq n\}$.

Given $\tau \in T'_n$ and $w = f(\tau) \in S_n$, let us compare $\operatorname{inv}(\tau)$ to $\operatorname{inv}(w)$. On one hand, $\operatorname{inv}(w)$
is defined as the number of $i < j$ with $w_i > w_j$. On the other hand, the description of the
edge set of $\tau = f^{-1}(w)$ at the end of the last paragraph shows that the number of $i < j$
with $w_i > w_j$ is also equal to $\operatorname{inv}(\tau)$. So $\operatorname{inv}(\tau) = \operatorname{inv}(w)$, and hence $\operatorname{sgn}(\tau) = \operatorname{sgn}(w)$.

Next, let us express $\operatorname{wt}(\tau)$ in terms of $w$. Since player $w_k$ beats exactly those players
$w_m$ such that $k < m \leq n$, we see that $\operatorname{wt}(\tau) = \prod_{k=1}^{n} x_{w_k}^{n-k}$. Define $\operatorname{wt}(w)$ by the right side
of this formula. The theorem now follows because $f$ is a weight-preserving, sign-preserving
bijection. □


We can use the bijection f to characterize transitive tournaments. 


12.64. Theorem: Criterion for Transitive Tournaments. A tournament $\tau \in T_n$ is
transitive iff no two vertices of $\tau$ have the same outdegree.


Proof. Given a transitive $\tau \in T_n$, let $w = f(\tau)$ be the permutation constructed in the
preceding proof. We have shown that $\operatorname{wt}(\tau) = \prod_{k=1}^{n} x_{w_k}^{n-k}$. The exponents $n-k$ are all
distinct, so every vertex of $\tau$ has a different outdegree.

Conversely, suppose $\tau \in T_n$ is such that every vertex has a different outdegree. There
are $n$ vertices and $n$ possible outdegrees (namely $0, 1, \ldots, n-1$), so each possible outdegree
occurs at exactly one vertex. Let $w_1$ be the unique vertex with outdegree $n-1$. Then $w_1$
beats all other players. Next, let $w_2$ be the unique vertex with outdegree $n-2$. Then $w_2$ must
beat all players except $w_1$. Continuing similarly, we obtain a permutation $w = w_1 w_2 \cdots w_n$
of $\{1,2,\ldots,n\}$ such that $w_j$ beats $w_k$ iff $j < k$. To confirm that $\tau$ is transitive, consider
three players $w_i, w_j, w_k$ such that $(w_i, w_j)$ and $(w_j, w_k)$ are edges of $\tau$. Then $i < j$ and
$j < k$, so $i < k$, so $(w_i, w_k)$ is an edge of $\tau$. (In fact, $\tau = f^{-1}(w)$.) □

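
The outdegree criterion can be confirmed exhaustively for small $n$. The sketch below (with hypothetical helper names `all_tournaments`, `is_transitive`, and `has_distinct_outdegrees`) tests transitivity directly from Definition 12.62 and compares the answer with the outdegree test for every tournament on up to five players.

```python
from itertools import combinations, product

def all_tournaments(n):
    """Yield the edge set of every n-player tournament on vertices 1..n."""
    pairs = list(combinations(range(1, n + 1), 2))
    for choices in product([False, True], repeat=len(pairs)):
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, choices)}

def is_transitive(edges):
    """Definition 12.62: (u,v) and (v,w) edges implies (u,w) is an edge."""
    return all((u, w) in edges
               for (u, v) in edges
               for (v2, w) in edges if v2 == v)

def has_distinct_outdegrees(edges, n):
    out = [sum(1 for (u, _) in edges if u == p) for p in range(1, n + 1)]
    return len(set(out)) == n

for n in range(1, 6):
    agree = all(is_transitive(t) == has_distinct_outdegrees(t, n)
                for t in all_tournaments(n))
    count = sum(1 for t in all_tournaments(n) if is_transitive(t))
    print(n, agree, count)  # the count of transitive tournaments is n!
```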
12.65. Theorem: Vandermonde Determinant Formula. Let $x_1, \ldots, x_n$ be real numbers
or formal variables. Define an $n \times n$ matrix $V$ by setting $V(i,j) = x_j^{n-i}$ for $1 \leq i,j \leq n$.
Then

\[
\det(V) = \prod_{1 \leq i < j \leq n} (x_i - x_j).
\]

Proof. According to Definition 12.40,

\[
\det(V) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} V(k, w(k)) = \sum_{w \in S_n} \operatorname{sgn}(w) \prod_{k=1}^{n} x_{w(k)}^{n-k}. \tag{12.13}
\]

This is the generating function for transitive tournaments, whereas $\prod_{i<j} (x_i - x_j)$ is the
generating function for all tournaments with $n$ players. So, it suffices to define a
sign-reversing, weight-preserving involution $I : T_n \to T_n$ with fixed point set $T'_n$. Define $I(\tau) = \tau$
for $\tau \in T'_n$. Now consider a non-transitive tournament $\tau \in T_n - T'_n$. By Theorem 12.64, there
exist two vertices $i < j$ with the same outdegree in $\tau$. If several pairs of vertices have the
same outdegree, then choose the pair such that $i$ and then $j$ is minimized. Define $I(\tau)$ by
switching the roles of $i$ and $j$ in $\tau$; more precisely, replace every directed edge $(u,v)$ in $\tau$
by $(s_{i,j}(u), s_{i,j}(v))$, where $s_{i,j}$ is the transposition $(i,j) \in S_n$. The resulting tournament is
non-transitive (since $i$ and $j$ still have the same outdegree in $I(\tau)$) and has the same weight
as $\tau$. Furthermore, $I(I(\tau)) = \tau$.

To finish, we show that $\operatorname{sgn}(I(\tau)) = -\operatorname{sgn}(\tau)$. Consider the factorization of $(i,j) \in S_n$
into $2(j-i)-1$ basic transpositions:

\[
(i,j) = (j-1,j)(j-2,j-1) \cdots (i+1,i+2)(i,i+1)(i+1,i+2) \cdots (j-2,j-1)(j-1,j).
\]

We can pass from $\tau$ to $I(\tau)$ in stages, by applying these basic transpositions one at a time
to the endpoints of the directed edges in $\tau$. We claim that each such step changes the sign
of the tournament. For, consider what happens to the inversion count when we pass from
a tournament $\sigma$ to $\sigma'$ by switching labels $k$ and $k+1$. The inversion $(k+1,k)$ is present
in exactly one of the tournaments $\sigma$ and $\sigma'$, and the other inversions are unaffected by the
label switch. So $\operatorname{inv}(\sigma')$ differs from $\operatorname{inv}(\sigma)$ by $\pm 1$, and hence $\operatorname{sgn}(\sigma') = -\operatorname{sgn}(\sigma)$. Since we
pass from $\tau$ to $I(\tau)$ by an odd number of moves of this type (namely $2(j-i)-1$ moves),
we see that $\operatorname{sgn}(I(\tau)) = -\operatorname{sgn}(\tau)$, as needed. □
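
As a numerical illustration of the Vandermonde formula, the following sketch (our own helper names; 0-based indices, so row $i$ holds the exponent $n-1-i$) builds $V$ for integer inputs and compares the permutation-sum determinant with the product of differences.

```python
from itertools import combinations, permutations
from math import prod

def sgn(w):
    inversions = sum(1 for i, j in combinations(range(len(w)), 2) if w[i] > w[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant via the permutation sum of Definition 12.40."""
    n = len(M)
    return sum(sgn(w) * prod(M[i][w[i]] for i in range(n)) for w in permutations(range(n)))

def vandermonde_matrix(xs):
    """V(i, j) = x_j^(n-i) in the text's 1-based indexing."""
    n = len(xs)
    return [[x ** (n - 1 - i) for x in xs] for i in range(n)]

def vandermonde_product(xs):
    """Product of (x_i - x_j) over all pairs i < j."""
    return prod(xs[i] - xs[j] for i, j in combinations(range(len(xs)), 2))

xs = [2, 3, 5, 7]
print(det(vandermonde_matrix(xs)), vandermonde_product(xs))  # 240 240
```

If two of the inputs coincide, the product has a zero factor and the matrix has two equal columns, so both sides vanish.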




12.12 The Hook-Length Formula 


This section presents a probabilistic proof of the Hook-Length Formula for the number of 
standard tableaux of a given shape. This formula was first stated in the Introduction. For 
the reader’s convenience, we begin by recalling the relevant definitions. 
12.66. Definitions. An integer partition of $n$ is a weakly decreasing sequence $\lambda = (\lambda_1 \geq
\lambda_2 \geq \cdots \geq \lambda_l)$ of positive integers with $\lambda_1 + \cdots + \lambda_l = n$. The diagram of $\lambda$ is

\[
\operatorname{dg}(\lambda) = \{(i,j) \in \mathbb{Z}_{>0} \times \mathbb{Z}_{>0} : 1 \leq i \leq l,\ 1 \leq j \leq \lambda_i\}.
\]

Each $(i,j) \in \operatorname{dg}(\lambda)$ is called a box or a cell. We take $i$ as the row index and $j$ as the column
index, where the highest row is row 1. Given any cell $c = (i,j) \in \operatorname{dg}(\lambda)$, the hook of $c$ in $\lambda$
is

\[
H(c) = \{(i,k) \in \operatorname{dg}(\lambda) : k \geq j\} \cup \{(k,j) \in \operatorname{dg}(\lambda) : k \geq i\}.
\]

The hook length of $c$ in $\lambda$ is $h(c) = |H(c)|$. A corner box of $\lambda$ is a cell $c \in \operatorname{dg}(\lambda)$ with
$h(c) = 1$. A standard tableau of shape $\lambda$ is a bijection $S : \operatorname{dg}(\lambda) \to \{1,2,\ldots,n\}$ such that
$S(i,j) < S(i,j+1)$ for all $i,j$ such that $(i,j)$ and $(i,j+1)$ are in $\operatorname{dg}(\lambda)$, and $S(i,j) <
S(i+1,j)$ for all $i,j$ such that $(i,j)$ and $(i+1,j)$ are in $\operatorname{dg}(\lambda)$. Let $\operatorname{SYT}(\lambda)$ be the set of
standard tableaux of shape $\lambda$, and let $f^\lambda = |\operatorname{SYT}(\lambda)|$.


12.67. Example. If $\lambda = (7,5,5,4,2,1)$ and $c = (3,2)$, then

\[
H(c) = \{(3,2), (3,3), (3,4), (3,5), (4,2), (5,2)\},
\]

and $h(c) = 6$. We can visualize $\operatorname{dg}(\lambda)$ and $H(c)$ using the following picture.


Let $\lambda'_j$ be the number of boxes in column $j$ of $\operatorname{dg}(\lambda)$. Then $h(i,j) = (\lambda_i - j) + (\lambda'_j - i) + 1$.
We use this formula to establish the following lemma.
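
Hooks and hook lengths are simple to compute from the row lengths of a partition. The following sketch (function names `conjugate`, `hook`, and `hook_length` are ours; cells are 1-indexed as in the text) builds the hook as a set of cells and also evaluates the closed formula $h(i,j) = (\lambda_i - j) + (\lambda'_j - i) + 1$, so the two can be compared.

```python
def conjugate(shape):
    """Column lengths lambda'_1, lambda'_2, ... of the diagram of shape."""
    if not shape:
        return []
    return [sum(1 for part in shape if part >= j) for j in range(1, shape[0] + 1)]

def hook(shape, i, j):
    """Cells of H(i, j): the cell itself, cells to its right, and cells below it."""
    arm = [(i, k) for k in range(j, shape[i - 1] + 1)]
    leg = [(k, j) for k in range(i + 1, len(shape) + 1) if shape[k - 1] >= j]
    return arm + leg

def hook_length(shape, i, j):
    """h(i, j) = (lambda_i - j) + (lambda'_j - i) + 1."""
    cols = conjugate(shape)
    return (shape[i - 1] - j) + (cols[j - 1] - i) + 1

shape = (7, 5, 5, 4, 2, 1)
print(hook(shape, 3, 2))         # the six cells listed in Example 12.67
print(hook_length(shape, 3, 2))  # 6
```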


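12.68. Lemma. Suppose $\lambda$ is a partition of $n$, $(r,s)$ is a corner box of $\lambda$, and
$(i,j) \in \mathrm{dg}(\lambda)$ satisfies $i < r$ and $j < s$. Then $h(i,j) = h(r,j) + h(i,s) - 1$.


Proof. Since $(r,s)$ is a corner box, $\lambda_r = s$ and $\lambda'_s = r$. So

\[
\begin{aligned}
h(r,j) + h(i,s) - 1 &= [(\lambda_r - j) + (\lambda'_j - r) + 1] + [(\lambda_i - s) + (\lambda'_s - i) + 1] - 1 \\
&= s - j + \lambda'_j - r + \lambda_i - s + r - i + 1 \\
&= (\lambda_i - j) + (\lambda'_j - i) + 1 = h(i,j). \qquad \Box
\end{aligned}
\]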


12.69. The Hook-Length Formula. For any partition $\lambda$ of $n$,

\[ f^\lambda = \frac{n!}{\prod_{c \in \mathrm{dg}(\lambda)} h(c)}. \]
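The formula is easy to evaluate by machine. The following Python sketch computes hook lengths from the column-length expression h(i, j) = (λ_i − j) + (λ'_j − i) + 1 and compares the Hook-Length Formula with a direct recursive count that removes one corner box at a time. The function names are our own illustrative choices, and the recursive count is only practical for small shapes.

```python
from math import factorial, prod

def conjugate(shape):
    """Column lengths: entry j-1 is the number of boxes in column j."""
    width = shape[0] if shape else 0
    return [sum(1 for part in shape if part >= j) for j in range(1, width + 1)]

def hook_length(shape, i, j):
    """h(i, j) = (lambda_i - j) + (lambda'_j - i) + 1, with 1-based i and j."""
    conj = conjugate(shape)
    return (shape[i - 1] - j) + (conj[j - 1] - i) + 1

def num_syt_hook(shape):
    """Number of standard tableaux of the given shape via the Hook-Length Formula."""
    n = sum(shape)
    hooks = [hook_length(shape, i, j)
             for i in range(1, len(shape) + 1)
             for j in range(1, shape[i - 1] + 1)]
    return factorial(n) // prod(hooks)

def num_syt_recursive(shape):
    """Count standard tableaux by deleting the box holding n in every possible way."""
    shape = tuple(p for p in shape if p > 0)
    if not shape:
        return 1
    total = 0
    for r in range(len(shape)):
        # Row r ends in a corner box iff it is strictly longer than the next row.
        if r == len(shape) - 1 or shape[r] > shape[r + 1]:
            smaller = shape[:r] + (shape[r] - 1,) + shape[r + 1:]
            total += num_syt_recursive(smaller)
    return total
```

For instance, `hook_length((7, 5, 5, 4, 2, 1), 3, 2)` reproduces the value h(c) = 6 from Example 12.67.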


The idea of the proof is to define a random algorithm that takes a partition $\lambda$ of $n$ as
input and produces a standard tableau $S \in \mathrm{SYT}(\lambda)$ as output. We prove in Theorem 12.74
that this algorithm outputs any given standard tableau $S$ with probability

\[ p = \frac{\prod_{c \in \mathrm{dg}(\lambda)} h(c)}{n!}. \]

This probability depends only on $\lambda$, not on $S$, so we obtain a uniform probability distribution
on the sample space $\mathrm{SYT}(\lambda)$. So, on one hand, each standard tableau is produced with
probability $p$; and on the other hand, each standard tableau is produced with probability
$1/|\mathrm{SYT}(\lambda)| = 1/f^\lambda$. Thus $f^\lambda = 1/p$, and we obtain the Hook-Length Formula.

Here is an informal description of the algorithm for generating a random standard
tableau of shape $\lambda$. Start at a random cell in the shape $\lambda$. As long as we are not at a corner
box, we jump from our current box $c$ to some other cell in $H(c)$; each cell in the hook is
chosen with equal probability. This jumping process eventually takes us to a corner cell.
We place the entry $n$ in this box, and then pretend this cell is no longer there. We are left
with a partition $\mu$ of size $n - 1$. Proceed recursively to select a random standard tableau
of shape $\mu$. Adding back the corner cell containing $n$ gives the standard tableau of shape
$\lambda$ produced by the algorithm.

Now we give a formal description of the algorithm. Every random choice below is to be 
independent of all other choices. 


12.70. Tableau Generation Algorithm. The input to the algorithm is a partition $\lambda$ of
$n$. The output is a tableau $S \in \mathrm{SYT}(\lambda)$, constructed according to the following random
procedure. As a base case, if $n = 0$, return the empty tableau of shape 0.


1. Choose a random cell $c \in \mathrm{dg}(\lambda)$. Each cell in $\mathrm{dg}(\lambda)$ is chosen with probability $1/n$.
2. While $h(c) > 1$, do the following.
2a. Choose a random cell $c' \in H(c) - \{c\}$. Each cell in $H(c) - \{c\}$ is chosen with
probability $1/(h(c) - 1)$.
2b. Replace $c$ by $c'$ and go back to Step 2.
3. Now $c$ is a corner box of $\mathrm{dg}(\lambda)$, so $\mathrm{dg}(\lambda) - \{c\}$ is the diagram of some
partition $\mu$ of $n - 1$. Recursively use the same algorithm to generate a random standard tableau
$S' \in \mathrm{SYT}(\mu)$. Extend this to a standard tableau $S \in \mathrm{SYT}(\lambda)$ by setting $S(c) = n$,
and output $S$ as the answer.
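The three steps can be sketched in Python as follows. This is an illustrative implementation rather than part of the text: the names `hook_cells` and `random_syt` are ours, the tableau is returned as a dictionary from cells (i, j) to entries, and the recursion in Step 3 is unrolled into a loop that fills in n, n − 1, ..., 1.

```python
import random

def hook_cells(shape, i, j):
    """Cells of H(c) other than c = (i, j) itself, using 1-based coordinates."""
    right = [(i, k) for k in range(j + 1, shape[i - 1] + 1)]
    below = [(k, j) for k in range(i + 1, len(shape) + 1) if shape[k - 1] >= j]
    return right + below

def random_syt(shape, rng=random):
    """Return a random standard tableau of the given shape as a dict cell -> entry."""
    shape = [p for p in shape if p > 0]
    tableau = {}
    n = sum(shape)
    while n > 0:
        # Step 1: choose a uniformly random cell of the current diagram.
        cells = [(i, j) for i in range(1, len(shape) + 1)
                 for j in range(1, shape[i - 1] + 1)]
        c = rng.choice(cells)
        # Step 2: jump uniformly within the hook until a corner box is reached.
        while True:
            others = hook_cells(shape, *c)
            if not others:
                break
            c = rng.choice(others)
        # Step 3: place n in the corner, delete that box, and continue with n - 1.
        tableau[c] = n
        shape[c[0] - 1] -= 1
        if shape[-1] == 0:
            shape.pop()
        n -= 1
    return tableau
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes a run reproducible.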


Let $(c_1, c_2, c_3, \ldots, c_k)$ be the sequence of cells chosen in Steps 1 and 2. Call this
sequence the hook walk for $n$. Note that the hook walk must be finite, since
$h(c_1) > h(c_2) > h(c_3) > \cdots$. Writing $c_s = (i_s, j_s)$ for each $s$, define
$I = \{i_1,\ldots,i_{k-1}\} - \{i_k\}$ and $J = \{j_1,\ldots,j_{k-1}\} - \{j_k\}$. We call $I$ and $J$
the row set and column set for this hook walk.


12.71. Example. Given $n = 24$ and $\lambda = (7,5,5,4,2,1)$, the first iteration of the algorithm
might proceed as follows.

[Figure: the diagram of $\lambda$ with the cells of a sample hook walk marked.]

In this situation, we place $n = 24$ in corner box $c$, and proceed recursively to fill in the rest
of the tableau. The probability that the algorithm chooses this particular hook walk for $n$
is

\[ \frac{1}{24} \prod_{t=1}^{k-1} \frac{1}{h(c_t) - 1}, \]

where $c_1, \ldots, c_k$ are the cells of the walk. The row set and column set for this hook walk are
$I = \{1\}$ and $J = \{2,3\}$.


The next lemma is the key technical fact needed to analyze the behavior of the tableau
generation algorithm.

12.72. Lemma. Given a partition $\lambda$ of $n$, a corner box $c = (r,s)$, and sets
$I \subseteq \{1,2,\ldots,r-1\}$ and $J \subseteq \{1,2,\ldots,s-1\}$, the probability that the hook walk
for $n$ ends at $c$ with row set $I$ and column set $J$ is

\[ p(\lambda,c,I,J) = \frac{1}{n} \prod_{i \in I} \frac{1}{h(i,s) - 1} \prod_{j \in J} \frac{1}{h(r,j) - 1}. \]


Proof. Write $I = \{i_1 < i_2 < \cdots < i_\ell\}$ and $J = \{j_1 < j_2 < \cdots < j_m\}$, where
$\ell, m \geq 0$. First we consider some degenerate cases. Say $I = J = \emptyset$. Then the hook walk
for $n$ consists of the single cell $c$. This happens with probability $1/n$, in agreement with the
formula in the lemma (interpreting the empty products as 1). Next, suppose $I$ is empty but $J$ is
not. The hook walk for $n$ in this case must be $c_1 = (r,j_1)$, $c_2 = (r,j_2)$, ...,
$c_m = (r,j_m)$, $c_{m+1} = (r,s)$. The probability of this hook walk is

\[ \frac{1}{n} \cdot \frac{1}{h(c_1) - 1} \cdot \frac{1}{h(c_2) - 1} \cdots \frac{1}{h(c_m) - 1}
   = \frac{1}{n} \prod_{j \in J} \frac{1}{h(r,j) - 1}. \]

Similarly, the result holds when $J$ is empty and $I$ is nonempty.

Now consider the case where both $I$ and $J$ are nonempty. We use induction on $|I| + |J|$. A
hook walk with row set $I$ and column set $J$ ending at $c$ must begin with the cell
$c_1 = (i_1, j_1)$; this cell is chosen in Step 1 of the algorithm with probability $1/n$. Now, there
are two possibilities for cell $c_2$: either $c_2 = (i_1, j_2)$ or $c_2 = (i_2, j_1)$ (with the
conventions $j_{m+1} = s$ and $i_{\ell+1} = r$). Each possibility for $c_2$ is chosen with probability
$1/(h(c_1) - 1) = 1/(h(i_1, j_1) - 1)$. When $c_2 = (i_1, j_2)$, the sequence $(c_2, \ldots, c_k)$
is a hook walk ending at $c$ with row set $I$ and column set $J' = J - \{j_1\}$. By induction, such
a hook walk occurs with probability

\[ \frac{1}{n} \prod_{i \in I} \frac{1}{h(i,s) - 1} \prod_{j \in J'} \frac{1}{h(r,j) - 1}. \]

However, since the walk really started at $c_1$ and proceeded to $c_2$, we replace the first factor
$1/n$ by $\frac{1}{n} \cdot \frac{1}{h(c_1) - 1}$. Similarly, when $c_2 = (i_2, j_1)$, the sequence
$(c_2, \ldots, c_k)$ is a hook walk ending at $c$ with row set $I' = I - \{i_1\}$ and column set $J$.
So the probability that the hook walk starts at $c_1$ and proceeds through $c_2 = (i_2, j_1)$ is

\[ \frac{1}{n} \cdot \frac{1}{h(c_1) - 1} \prod_{i \in I'} \frac{1}{h(i,s) - 1} \prod_{j \in J} \frac{1}{h(r,j) - 1}. \]

Adding these two terms, we see that

\[ p(\lambda,c,I,J) = \frac{1}{n} \cdot \frac{1}{h(c_1) - 1}
   \prod_{i \in I'} \frac{1}{h(i,s) - 1} \prod_{j \in J'} \frac{1}{h(r,j) - 1}
   \cdot \left( \frac{1}{h(i_1,s) - 1} + \frac{1}{h(r,j_1) - 1} \right). \]

The factor in parentheses is

\[ \frac{h(r,j_1) + h(i_1,s) - 2}{(h(i_1,s) - 1)(h(r,j_1) - 1)}. \]

Using Lemma 12.68, the numerator simplifies to $h(i_1,j_1) - 1 = h(c_1) - 1$. This factor cancels
and leaves us with

\[ p(\lambda,c,I,J) = \frac{1}{n} \prod_{i \in I} \frac{1}{h(i,s) - 1} \prod_{j \in J} \frac{1}{h(r,j) - 1}. \]

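This completes the induction proof. □

12.73. Theorem: Probability that a Hook Walk ends at c. Given a partition $\lambda$ of $n$
and a corner box $c = (r,s)$ of $\mathrm{dg}(\lambda)$, the probability that the hook walk for $n$ ends
at $c$ is

\[ p(\lambda,c) = \frac{1}{n} \prod_{i=1}^{r-1} \frac{h(i,s)}{h(i,s) - 1} \prod_{j=1}^{s-1} \frac{h(r,j)}{h(r,j) - 1}. \]


Proof. Write $[r-1] = \{1,2,\ldots,r-1\}$ and $[s-1] = \{1,2,\ldots,s-1\}$. By the Sum Rule for
probabilities,

\[
\begin{aligned}
p(\lambda,c) &= \sum_{I \subseteq [r-1]} \sum_{J \subseteq [s-1]} p(\lambda,c,I,J) \\
&= \frac{1}{n} \sum_{I \subseteq [r-1]} \sum_{J \subseteq [s-1]}
   \prod_{i \in I} \frac{1}{h(i,s) - 1} \prod_{j \in J} \frac{1}{h(r,j) - 1} \\
&= \frac{1}{n} \left( \sum_{I \subseteq [r-1]} \prod_{i \in I} \frac{1}{h(i,s) - 1} \right)
   \cdot \left( \sum_{J \subseteq [s-1]} \prod_{j \in J} \frac{1}{h(r,j) - 1} \right).
\end{aligned}
\]

By induction on $r$, or by the Generalized Distributive Law, it can be checked that

\[ \sum_{I \subseteq [r-1]} \prod_{i \in I} \frac{1}{h(i,s) - 1}
   = \prod_{i=1}^{r-1} \left( 1 + \frac{1}{h(i,s) - 1} \right)
   = \prod_{i=1}^{r-1} \frac{h(i,s)}{h(i,s) - 1} \]

(cf. the proof of (4.3)). The sum over $J$ can be simplified in a similar way, giving the formula
in the theorem. □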
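As a sanity check on the product formula, one can compute the same probability in two ways with exact rational arithmetic. The sketch below (with illustrative function names of our own) evaluates the formula of Theorem 12.73 and, independently, follows the hook walk one jump at a time, averaging over the equally likely next cells.

```python
from fractions import Fraction

def hook_length(shape, i, j):
    col_len = sum(1 for part in shape if part >= j)
    return (shape[i - 1] - j) + (col_len - i) + 1

def corners(shape):
    """Corner boxes (r, s): the last box of each row that is longer than the next row."""
    return [(r, shape[r - 1]) for r in range(1, len(shape) + 1)
            if r == len(shape) or shape[r - 1] > shape[r]]

def end_probability(shape, corner):
    """p(lambda, c) from the product formula of Theorem 12.73."""
    r, s = corner
    p = Fraction(1, sum(shape))
    for i in range(1, r):
        h = hook_length(shape, i, s)
        p *= Fraction(h, h - 1)
    for j in range(1, s):
        h = hook_length(shape, r, j)
        p *= Fraction(h, h - 1)
    return p

def end_probability_direct(shape, corner):
    """The same probability, computed by following the walk jump by jump."""
    memo = {}

    def reach(c):
        # Probability that a walk currently at cell c ends at `corner`.
        if c not in memo:
            i, j = c
            others = [(i, k) for k in range(j + 1, shape[i - 1] + 1)]
            others += [(k, j) for k in range(i + 1, len(shape) + 1) if shape[k - 1] >= j]
            if not others:
                memo[c] = Fraction(int(c == corner))
            else:
                memo[c] = sum((reach(d) for d in others), Fraction(0)) / len(others)
        return memo[c]

    cells = [(i, j) for i in range(1, len(shape) + 1) for j in range(1, shape[i - 1] + 1)]
    return sum((reach(c) for c in cells), Fraction(0)) / sum(shape)
```

Since every hook walk ends at exactly one corner box, the values `end_probability(shape, c)` must also sum to 1 over all corners c of the shape.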


The next theorem is the final step in the proof of the Hook-Length Formula. 


12.74. Theorem: Probability of Generating a Given Tableau. If $\lambda$ is a fixed partition
of $n$ and $S \in \mathrm{SYT}(\lambda)$, the tableau generation algorithm for $\lambda$ outputs $S$ with
probability

\[ \frac{\prod_{c \in \mathrm{dg}(\lambda)} h(c)}{n!}. \]


Proof. We prove the theorem by induction on $n$. Note first that the result does hold for
$n = 0$ and $n = 1$. For the induction step, assume the result is known for partitions and
tableaux with fewer than $n$ boxes. Let $c^* = (r,s)$ be the cell such that $S(c^*) = n$, let $\mu$
be the partition obtained by removing $c^*$ from $\mathrm{dg}(\lambda)$, and let
$S' \in \mathrm{SYT}(\mu)$ be the tableau obtained by erasing $n$ from $S$. First, the probability that
the hook walk for $n$ (in Steps 1 and 2 of the algorithm) ends at $c^*$ is $p(\lambda, c^*)$. Given that
this event has occurred, induction tells us that the probability of generating $S'$ in Step 3 is

\[ \frac{\prod_{c \in \mathrm{dg}(\mu)} h_\mu(c)}{(n-1)!}, \]

where $h_\mu(c)$ refers to the hook length of $c$ relative to $\mathrm{dg}(\mu)$. Multiplying these
probabilities, the probability of generating $S$ is therefore

\[ \frac{1}{n!} \prod_{c \in \mathrm{dg}(\mu)} h_\mu(c)
   \prod_{i=1}^{r-1} \frac{h_\lambda(i,s)}{h_\lambda(i,s) - 1}
   \prod_{j=1}^{s-1} \frac{h_\lambda(r,j)}{h_\lambda(r,j) - 1}. \]

Now, consider what happens to the hook lengths of cells when we pass from $\mu$ to $\lambda$ by
restoring the box $c^* = (r,s)$. For every cell $c = (i,j) \in \mathrm{dg}(\mu)$ with $i \neq r$ and
$j \neq s$, we have $h_\mu(c) = h_\lambda(c)$. If $c = (i,s) \in \mathrm{dg}(\mu)$ with $i < r$, then
$h_\mu(c) = h_\lambda(c) - 1 = h_\lambda(i,s) - 1$. Thus, the fractions in the second product convert
$h_\mu(c)$ to $h_\lambda(c)$ for each such $c$. Similarly, if $c = (r,j) \in \mathrm{dg}(\mu)$ with $j < s$,
then $h_\mu(c) = h_\lambda(c) - 1 = h_\lambda(r,j) - 1$. So the fractions in the third product convert
$h_\mu(c)$ to $h_\lambda(c)$ for each such $c$. So we are left with

\[ \frac{1}{n!} \prod_{c \in \mathrm{dg}(\mu)} h_\lambda(c) = \frac{\prod_{c \in \mathrm{dg}(\lambda)} h_\lambda(c)}{n!}, \]

where the last equality follows since $h_\lambda(c^*) = 1$. This completes the proof by induction. □


12.13 Knuth Equivalence 


Let $[N]$ denote the set $\{1,2,\ldots,N\}$, and let $[N]^* = \bigcup_{k \geq 0} [N]^k$ be the set of all
words using the alphabet $[N]$. Given a word $w \in [N]^*$, we can use the RSK algorithm to construct
the insertion tableau $P(w)$, which is a semistandard tableau using the same multiset of letters
as $w$ (see §9.23). This section studies some of the relationships between $w$ and $P(w)$. In
particular, we show that the shape of $P(w)$ contains information about increasing and
decreasing subsequences of $w$. First we show how to encode semistandard tableaux using
words.


12.75. Definition: Reading Word of a Tableau. Let $\lambda = (\lambda_1,\ldots,\lambda_k)$ be an integer
partition, and let $T \in \mathrm{SSYT}(\lambda)$. The reading word of $T$ is

\[
\begin{aligned}
\mathrm{rw}(T) = {} & T(k,1), T(k,2), \ldots, T(k,\lambda_k), \\
& T(k-1,1), T(k-1,2), \ldots, T(k-1,\lambda_{k-1}), \ \ldots, \\
& T(1,1), T(1,2), \ldots, T(1,\lambda_1).
\end{aligned}
\]

Thus, $\mathrm{rw}(T)$ is the concatenation of the weakly increasing words appearing in each row
of $T$, reading the rows from bottom to top. Note that $T(j,\lambda_j) \geq T(j,1) > T(j-1,1)$ for
all $j > 1$. This implies that we can recover the tableau $T$ from $\mathrm{rw}(T)$ by starting a new row
whenever we see a strict descent in $\mathrm{rw}(T)$.
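Both directions of this encoding are short to program. In the following sketch (our own illustration), a tableau is stored as a list of rows with the top row first, and the function names are hypothetical.

```python
def reading_word(tableau):
    """Concatenate the rows of the tableau, reading the rows from bottom to top."""
    return [x for row in reversed(tableau) for x in row]

def tableau_from_reading_word(word):
    """Invert reading_word by starting a new row at each strict descent."""
    rows = []
    for x in word:
        if rows and rows[-1][-1] <= x:
            rows[-1].append(x)   # still inside the same weakly increasing row
        else:
            rows.append([x])     # first letter, or a strict descent: start a new row
    return rows[::-1]            # the rows were read from bottom to top
```

Note that `tableau_from_reading_word` splits any word at its strict descents; the result is a semistandard tableau only when the input really is the reading word of one.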


12.76. Example. Given the tableau

        1 1 2 3 4 4 6
    T = 2 4 5 6 6
        3 5 7 8
        4 6

the reading word of $T$ is $\mathrm{rw}(T) = 463578245661123446$. Given that the word
$w = 7866453446223511224$ is the reading word of some tableau $S$, we deduce that $S$ must be

        1 1 2 2 4
        2 2 3 5
    S = 3 4 4 6
        4 5
        6 6
        7 8

by looking at the descents in $w$.

Next we introduce two equivalence relations on $[N]^*$ that are related to the map sending
$w \in [N]^*$ to $P(w)$.


12.77. Definition: P-Equivalence. Two words $v, w \in [N]^*$ are called P-equivalent,
denoted $v \equiv_P w$, iff $P(v) = P(w)$.


12.78. Definition: Knuth Equivalence. The elementary Knuth relation of the first kind
on $[N]^*$ is

\[ K_1 = \{(uyxzv,\ uyzxv) : u, v \in [N]^*,\ x, y, z \in [N],\ \text{and } x < y \leq z\}. \]

The elementary Knuth relation of the second kind on $[N]^*$ is

\[ K_2 = \{(uxzyv,\ uzxyv) : u, v \in [N]^*,\ x, y, z \in [N],\ \text{and } x \leq y < z\}. \]

Two words $v, w \in [N]^*$ are Knuth equivalent, denoted $v \equiv_K w$, iff there is a finite sequence
of words $v = v^0, v^1, v^2, \ldots, v^k = w$ such that, for all $i$ in the range $1 \leq i \leq k$, either

\[ (v^{i-1}, v^i) \in K_1 \cup K_2 \quad \text{or} \quad (v^i, v^{i-1}) \in K_1 \cup K_2. \]


12.79. Remark. Informally, Knuth equivalence allows us to modify words by repeatedly
changing subsequences of three consecutive letters according to certain rules. Specifically,
if the middle value among the three letters does not occupy the middle position, then the
other two values can switch positions. To determine which value is the middle value in the
case of repeated letters, use the rule that the letter to the right is larger. These comments
should aid the reader in remembering the inequalities in the definitions of $K_1$ and $K_2$.
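To experiment with these relations, here is a small Python sketch (the function names are our own) that lists every word reachable from a given word by one elementary Knuth move in either direction, and then finds the whole Knuth equivalence class by graph search. Words are stored as tuples of integers.

```python
def knuth_neighbors(word):
    """All words obtained from `word` (a tuple) by one elementary Knuth move."""
    result = set()
    for i in range(len(word) - 2):
        a, b, c = word[i:i + 3]
        # First kind: y x z <-> y z x with x < y <= z (swap the last two letters).
        if b < a <= c or c < a <= b:
            result.add(word[:i] + (a, c, b) + word[i + 3:])
        # Second kind: x z y <-> z x y with x <= y < z (swap the first two letters).
        if a <= c < b or b <= c < a:
            result.add(word[:i] + (b, a, c) + word[i + 3:])
    return result

def knuth_class(word):
    """The Knuth equivalence class of `word`, found by depth-first search."""
    start = tuple(word)
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in knuth_neighbors(current):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

For example, `knuth_class((2, 1, 3))` consists of the two words 213 and 231, while 123 and 321 are each alone in their classes.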


It is routine to check that $\equiv_P$ and $\equiv_K$ are equivalence relations on $[N]^*$. Our current
goal is to prove that these equivalence relations are equal. First we show that we can simulate
each step in the Tableau Insertion Algorithm 9.46 using the elementary Knuth relations.


12.80. Theorem: Reading Words and Knuth Equivalence. For all $w \in [N]^*$,
$w \equiv_K \mathrm{rw}(P(w))$.


Proof. First note that, for any words $u, z, v, v' \in [N]^*$, if $v \equiv_K v'$ then $uvz \equiv_K uv'z$.
Now, fix $w = w_1 w_2 \cdots w_k \in [N]^*$ and use induction on $k$. The theorem holds if $k \leq 1$,
since $\mathrm{rw}(P(w)) = w$ in this case. For the induction step, assume $k > 1$ and write
$T' = P(w_1 w_2 \cdots w_{k-1})$ and $T = P(w)$. By the induction hypothesis,
$w_1 \cdots w_{k-1} \equiv_K \mathrm{rw}(T')$, so $w = (w_1 \cdots w_{k-1}) w_k \equiv_K \mathrm{rw}(T') w_k$.
Therefore, it suffices to prove that $\mathrm{rw}(T') w_k$ is Knuth equivalent to
$\mathrm{rw}(T) = \mathrm{rw}(T' \leftarrow w_k)$. We prove this by induction on $\ell$, the number of rows
in the tableau $T'$.

For the base case, let $\ell = 1$. Then $\mathrm{rw}(T')$ is a weakly increasing sequence
$u_1 u_2 \cdots u_{k-1}$. If $u_{k-1} \leq w_k$, then $T$ is obtained from $T'$ by appending $w_k$ at the
end of the first row. In this situation, $\mathrm{rw}(T') w_k = u_1 \cdots u_{k-1} w_k = \mathrm{rw}(T)$, so the
required result holds. On the other hand, if $w_k < u_{k-1}$, let $j$ be the least index with
$w_k < u_j$. When inserting $w_k$ into $T'$, $w_k$ will bump $u_j$ into the second row, so that

\[ \mathrm{rw}(T) = u_j u_1 u_2 \cdots u_{j-1} w_k u_{j+1} \cdots u_{k-1}. \]

We now show that this word can be obtained from $\mathrm{rw}(T') w_k = u_1 \cdots u_{k-1} w_k$ by a sequence
of elementary Knuth equivalences. If $j \leq k - 2$, then $w_k < u_{k-2} \leq u_{k-1}$ implies

\[ (u_1 \cdots u_{k-3} u_{k-2} w_k u_{k-1},\ u_1 \cdots u_{k-3} u_{k-2} u_{k-1} w_k) \in K_1. \]

So $\mathrm{rw}(T') w_k$ is Knuth-equivalent to the word obtained by interchanging $w_k$ with the letter
$u_{k-1}$ to its immediate left. Similarly, if $j \leq k - 3$, the inequality $w_k < u_{k-3} \leq u_{k-2}$
lets us interchange $w_k$ with $u_{k-2}$. We can continue in this way, using elementary Knuth
equivalences of the first kind, to see that

\[ \mathrm{rw}(T') w_k \equiv_K u_1 \cdots u_{j-1} u_j w_k u_{j+1} \cdots u_{k-1}. \]

Now, we have $u_{j-1} \leq w_k < u_j$, so an elementary Knuth equivalence of the second kind
transforms this word into

\[ u_1 \cdots u_{j-2} u_j u_{j-1} w_k u_{j+1} \cdots u_{k-1}. \]

If $j > 2$, we now have $u_{j-2} \leq u_{j-1} < u_j$, so we can interchange $u_j$ with $u_{j-2}$. We can
continue in this way until $u_j$ reaches the left end of the word. We have now transformed
$\mathrm{rw}(T') w_k$ into $\mathrm{rw}(T)$ by elementary Knuth equivalences, so
$\mathrm{rw}(T') w_k \equiv_K \mathrm{rw}(T)$.

For the induction step, assume $\ell > 1$. Let $T''$ be the tableau $T'$ with its first (longest)
row erased. Then $\mathrm{rw}(T') = \mathrm{rw}(T'') u_1 \cdots u_p$ where $u_1 \leq \cdots \leq u_p$ is the
weakly increasing sequence in the first row of $T'$. If $u_p \leq w_k$, then
$\mathrm{rw}(T') w_k = \mathrm{rw}(T)$. Otherwise, assume $w_k$ bumps $u_j$ in the insertion
$T' \leftarrow w_k$. By the result in the last paragraph,

\[ \mathrm{rw}(T') w_k \equiv_K \mathrm{rw}(T'')\, u_j u_1 \cdots u_{j-1} w_k u_{j+1} \cdots u_p. \]

Now, by the induction hypothesis, $\mathrm{rw}(T'') u_j \equiv_K \mathrm{rw}(T'' \leftarrow u_j)$. Thus,

\[ \mathrm{rw}(T') w_k \equiv_K \mathrm{rw}(T'' \leftarrow u_j)\, u' \]

where $u'$ is $u_1 \cdots u_p$ with $u_j$ replaced by $w_k$. But, by definition of tableau insertion,
$\mathrm{rw}(T) = \mathrm{rw}(T'' \leftarrow u_j)\, u'$. This completes the induction step. □


12.81. Example. Let us illustrate how elementary Knuth equivalences implement the steps
in the insertion $T \leftarrow 3$, where

        1 1 3 4 4 6
    T = 2 2 4 5
        3 4

Appending a 3 at the right end of $\mathrm{rw}(T)$, we first compute

\[
\begin{aligned}
34\ 2245\ 113446\ 3 &\equiv_K 34\ 2245\ 1134436 \equiv_K 34\ 2245\ 1134346 \\
&\equiv_K 34\ 2245\ 1143346 \equiv_K 34\ 2245\ 1413346 \equiv_K 34\ 2245\ 4\ 113346.
\end{aligned}
\]

The steps so far correspond to the insertion of 3 into the first row of $T$, which bumps the
leftmost 4 into the second row. Continuing,

\[ 34\ 22454\ 113346 \equiv_K 34\ 22544\ 113346 \equiv_K 34\ 25244\ 113346 \equiv_K 34\ 5\ 2244\ 113346, \]

and now the incoming 4 has bumped the 5 into the third row. The process stops here with
the word $3452244113346 = \mathrm{rw}(T \leftarrow 3)$, where

                 1 1 3 3 4 6
    (T ← 3)  =   2 2 4 4
                 3 4 5

We see that $\mathrm{rw}(T)3 \equiv_K \mathrm{rw}(T \leftarrow 3)$, in agreement with the proof above.


12.82. Definition: Increasing and Decreasing Subsequences. Let
$w = w_1 w_2 \cdots w_n \in [N]^*$. An increasing subsequence of $w$ of length $\ell$ is a subset
$I = \{i_1 < i_2 < \cdots < i_\ell\}$ of $\{1,2,\ldots,n\}$ such that
$w_{i_1} \leq w_{i_2} \leq \cdots \leq w_{i_\ell}$. A decreasing subsequence of $w$ of length $\ell$
is a subset $I = \{i_1 < i_2 < \cdots < i_\ell\}$ such that $w_{i_1} > w_{i_2} > \cdots > w_{i_\ell}$. A set
of $k$ disjoint increasing subsequences of $w$ is a set $\{I_1,\ldots,I_k\}$ of pairwise disjoint
increasing subsequences of $w$. For each integer $k \geq 1$, let $\mathrm{inc}_k(w)$ be the maximum value
of $|I_1| + \cdots + |I_k|$ over all such sets. Similarly, let $\mathrm{dec}_k(w)$ be the maximum total
length of a set of $k$ disjoint decreasing subsequences of $w$.

12.83. Theorem: Knuth Equivalence and Monotone Subsequences. For all $v, w$ in
$[N]^*$ and all $k \in \mathbb{Z}_{\geq 1}$, $v \equiv_K w$ implies $\mathrm{inc}_k(v) = \mathrm{inc}_k(w)$ and
$\mathrm{dec}_k(v) = \mathrm{dec}_k(w)$.


Proof. It suffices to consider the case where $v$ and $w$ differ by a single elementary Knuth
equivalence. First suppose $x, y, z \in [N]$,

\[ v = ayxzb, \qquad w = ayzxb, \qquad x < y \leq z, \]

and the $y$ occurs at position $i$. If $I$ is an increasing subsequence of $w$, then $i+1$ and $i+2$ do
not both belong to $I$ (since $z > x$). Therefore, if $\{I_1,\ldots,I_k\}$ is any set of $k$ disjoint
increasing subsequences of $w$, we can obtain a set $\{I'_1,\ldots,I'_k\}$ of disjoint increasing
subsequences of $v$ by replacing $i+1$ by $i+2$ and $i+2$ by $i+1$ in any $I_j$ in which one of these
indices appears. Since $|I'_1| + \cdots + |I'_k| = |I_1| + \cdots + |I_k|$, we deduce that
$\mathrm{inc}_k(w) \leq \mathrm{inc}_k(v)$.

To establish the opposite inequality, let $\mathcal{I} = \{I_1, I_2, \ldots, I_k\}$ be any set of $k$
disjoint increasing subsequences of $v$. We construct a set of $k$ disjoint increasing subsequences
of $w$ having the same total size as $\mathcal{I}$. The device used in the previous paragraph works
here, unless some member of $\mathcal{I}$ (say $I_1$) contains both $i+1$ and $i+2$. In this case, we
cannot have $i \in I_1$, since $y > x$. If no other member of $\mathcal{I}$ contains $i$, we replace
$I_1$ by $(I_1 - \{i+2\}) \cup \{i\}$, which is an increasing subsequence of $w$. On the other hand,
suppose $i+1, i+2 \in I_1$, and some other member of $\mathcal{I}$ (say $I_2$) contains $i$. Write

\[
\begin{aligned}
I_1 &= \{j_1 < j_2 < \cdots < j_r < i+1 < i+2 < j_{r+1} < \cdots < j_p\}, \\
I_2 &= \{k_1 < k_2 < \cdots < k_s < i < k_{s+1} < \cdots < k_q\},
\end{aligned}
\]

and note that $v_{j_r} \leq x < z \leq v_{j_{r+1}}$ and $v_{k_s} \leq y \leq v_{k_{s+1}}$. Replace these two
disjoint increasing subsequences of $v$ by

\[
\begin{aligned}
I'_1 &= \{j_1 < j_2 < \cdots < j_r < i+2 < k_{s+1} < \cdots < k_q\}, \\
I'_2 &= \{k_1 < k_2 < \cdots < k_s < i < i+1 < j_{r+1} < \cdots < j_p\}.
\end{aligned}
\]

Since $w_{j_r} \leq x \leq w_{k_{s+1}}$ and $w_{k_s} \leq y \leq z \leq w_{j_{r+1}}$, $I'_1$ and $I'_2$ are two
disjoint increasing subsequences of $w$ having the same total length as $I_1$ and $I_2$. This
completes the proof that $\mathrm{inc}_k(w) \geq \mathrm{inc}_k(v)$.

Similar reasoning proves the result in the case where

\[ v = axzyb, \qquad w = azxyb, \qquad \text{and } x \leq y < z. \]

We also ask the reader to prove the statement about decreasing subsequences. □


12.84. Theorem: Subsequences and the Shape of Insertion Tableaux. Assume $w$
is in $[N]^*$ and $P(w)$ has shape $\lambda$. For all integers $k \geq 1$,

\[ \mathrm{inc}_k(w) = \lambda_1 + \cdots + \lambda_k, \qquad \mathrm{dec}_k(w) = \lambda'_1 + \cdots + \lambda'_k. \]

In particular, $\lambda_1$ is the length of the longest increasing subsequence of $w$, whereas
$\ell(\lambda)$ is the length of the longest decreasing subsequence of $w$.


Proof. Let $w' = \mathrm{rw}(P(w))$. We know $w \equiv_K w'$ by Theorem 12.80, so
$\mathrm{inc}_k(w) = \mathrm{inc}_k(w')$ and $\mathrm{dec}_k(w) = \mathrm{dec}_k(w')$ by Theorem 12.83. So we need
only prove

\[ \mathrm{inc}_k(w') = \lambda_1 + \cdots + \lambda_k, \qquad \mathrm{dec}_k(w') = \lambda'_1 + \cdots + \lambda'_k. \]

Now, $w'$ consists of increasing sequences of letters of successive lengths
$\lambda_l, \ldots, \lambda_2, \lambda_1$, where $l = \ell(\lambda)$. By taking $I_1, I_2, \ldots, I_k$ to be the
sets of positions of the last $k$ of these sequences, we obtain $k$ disjoint increasing subsequences
of $w'$ of total length $\lambda_1 + \cdots + \lambda_k$. Therefore,
$\mathrm{inc}_k(w') \geq \lambda_1 + \cdots + \lambda_k$.

On the other hand, let $\{I_1,\ldots,I_k\}$ be any $k$ disjoint increasing subsequences of $w'$. Each
position $i$ in $w'$ is associated to a particular box in the diagram of $\lambda$, via Definition 12.75.
For example, position 1 corresponds to the first box in the last row, while the last position
corresponds to the last box in the first row. For each position $i$ that belongs to some $I_j$,
place an X in the corresponding box in the diagram of $\lambda$. Since entries in a given column of
$P(w)$ strictly decrease reading from bottom to top, the X’s coming from a given increasing
subsequence $I_j$ must all lie in different columns of the diagram. It follows that every column
of the diagram contains $k$ or fewer X’s. Suppose we push all these X’s up their columns as
far as possible. Then all the X’s in the resulting figure must lie in the top $k$ rows of $\lambda$. It
follows that the number of X’s, which is $|I_1| + \cdots + |I_k|$, cannot exceed
$\lambda_1 + \cdots + \lambda_k$. This gives $\mathrm{inc}_k(w') \leq \lambda_1 + \cdots + \lambda_k$. The proof for
$\mathrm{dec}_k(w)$ is similar. □


12.85. Theorem: Knuth Equivalence and Tableau Shape. For all $v, w \in [N]^*$,
$v \equiv_K w$ implies that $P(v)$ and $P(w)$ have the same shape.


Proof. Fix $v, w \in [N]^*$ with $v \equiv_K w$. Let $\lambda$ and $\mu$ be the shapes of $P(v)$ and $P(w)$,
respectively. Using Theorems 12.83 and 12.84, we see that for all $k \geq 1$,

\[ \lambda_k = \mathrm{inc}_k(v) - \mathrm{inc}_{k-1}(v) = \mathrm{inc}_k(w) - \mathrm{inc}_{k-1}(w) = \mu_k. \qquad \Box \]


12.86. Example. Consider the word $w = 35164872$. As shown in Figure 9.1, we have

           1 2 6 7
    P(w) = 3 4 8
           5

Since the shape is $\lambda = (4,3,1)$, the longest increasing subsequence of $w$ has length 4.
Two such subsequences are $I_1 = \{1,2,4,7\}$ (corresponding to the subword 3567) and
$I_2 = \{1,2,4,6\}$. Note that the first row of $P(w)$, namely 1267, does not appear as a subword of
$w$. Since the column lengths of $\lambda$ are $(3,2,2,1)$, the longest length of two disjoint decreasing
subsequences of $w$ is $3 + 2 = 5$. For example, we could take $I_1 = \{6,7,8\}$ and $I_2 = \{4,5\}$ to
achieve this. Note that $w' = \mathrm{rw}(P(w)) = 5\ 348\ 1267$. To illustrate the end of the previous
proof, consider the two disjoint increasing subsequences $I_1 = \{1,4\}$ and $I_2 = \{2,3,7,8\}$ of
$w'$ (this pair does not achieve the maximum length for such subsequences). Drawing X’s in
the boxes of the diagram associated to the positions in $I_1$ (respectively $I_2$) produces the
following pictures, where empty boxes are shown as dots.

    .  .  .  .                        .  .  X  X
    .  .  X         respectively      X  X  .
    X                                 .

Combining these diagrams and pushing the X’s up as far as they can go, we get

    X  X  X  X
    X  .  X
    .

So, indeed, the combined length of $I_1$ and $I_2$ does not exceed $\lambda_1 + \lambda_2$.
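The numbers in this example can be confirmed by brute force. The sketch below is our own illustration: `insertion_tableau` performs row insertion (each incoming letter bumps the leftmost strictly larger entry of a row), and `max_disjoint` computes inc_k or dec_k by trying every assignment of positions to k subsequences, which is feasible only for short words.

```python
from bisect import bisect_right
from itertools import product

def insertion_tableau(word):
    """P(word) as a list of rows, built by repeated row insertion."""
    rows = []
    for x in word:
        for row in rows:
            pos = bisect_right(row, x)   # index of the leftmost entry > x
            if pos == len(row):
                row.append(x)            # x fits at the end of this row
                break
            row[pos], x = x, row[pos]    # bump, then insert the old entry lower down
        else:
            rows.append([x])             # fell off the bottom: start a new row
    return rows

def max_disjoint(word, k, increasing=True):
    """Brute-force inc_k(word) (weakly increasing) or dec_k(word) (strictly decreasing)."""
    best = 0
    # Label 0 means "position unused"; labels 1..k name the k subsequences.
    for labels in product(range(k + 1), repeat=len(word)):
        last = [None] * (k + 1)
        ok = True
        for x, lab in zip(word, labels):
            if lab == 0:
                continue
            prev = last[lab]
            if prev is not None and (prev > x if increasing else prev <= x):
                ok = False
                break
            last[lab] = x
        if ok:
            best = max(best, sum(1 for lab in labels if lab))
    return best
```

For w = 35164872, comparing `max_disjoint(w, k)` with the partial sums of the row lengths of `insertion_tableau(w)` gives a direct check of Theorem 12.84.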


The next lemma provides the remaining ingredients needed to establish that P-equivalence
and Knuth equivalence are the same.


12.87. Lemma. Suppose $v, w \in [N]^*$ and $z$ is the largest symbol appearing in both $v$
and $w$. Let $v'$ (respectively $w'$) be the word obtained by erasing the rightmost $z$ from $v$
(respectively $w$). If $v \equiv_K w$, then $v' \equiv_K w'$. Furthermore, if $T = P(v)$ and $T' = P(v')$,
then $T'$ can be obtained from $T$ by erasing the rightmost box containing $z$.

Proof. Write $v = azb$ and $w = czd$ where $a, b, c, d \in [N]^*$ and $z$ does not appear in $b$
or $d$. First assume that $v$ and $w$ differ by a single elementary Knuth relation. If the triple
of letters affected by this relation is part of the subword $a$, then $a \equiv_K c$ and $b = d$, so
$v' = ab \equiv_K cd = w'$. Similarly, the result holds if the triple of letters is part of the subword
$b$. The next possibility is that for some $x, y$ with $x < y \leq z$,

\[ v = a'yxzb \quad \text{and} \quad w = a'yzxb \]

or vice versa. Then $v' = a'yxb = w'$, so certainly $v' \equiv_K w'$. Another possibility is that for
some $x, y$ with $x \leq y < z$,

\[ v = cxzyb' \quad \text{and} \quad w = czxyb' \]

or vice versa, and we again have $v' = cxyb' = w'$. Since the $z$ under consideration is the
rightmost occurrence of the largest symbol in both $v$ and $w$, the possibilities already considered
are the only elementary Knuth relations that involve this symbol. So the result holds
when $v$ and $w$ differ by one elementary Knuth relation. Now, if $v = v^0, v^1, v^2, \ldots, v^k = w$
is a sequence of words as in Definition 12.78, we can write each $v^i = a^i z b^i$ where $z$ does
not appear in $b^i$. Letting $(v^i)' = a^i b^i$ for each $i$, the chain
$v' = (v^0)', (v^1)', \ldots, (v^k)' = w'$ proves that $v' \equiv_K w'$.

Now consider the actions of the Tableau Insertion Algorithm 9.46 applied to $v = azb$
and to $v' = ab$. We prove the statement about $T$ and $T'$ by induction on the length of $b$.
The statement holds if $b$ is empty. Assume $b$ has length $k > 0$ and the statement is known
for smaller values of $k$. Write $b = b'x$ where $x \in [N]$. Define $T'_1 = P(ab')$ and $T_1 = P(azb')$.
By induction hypothesis, $T'_1$ is $T_1$ with the rightmost $z$ erased. By definition,
$T' = (T'_1 \leftarrow x)$ and $T = (T_1 \leftarrow x)$. When we insert the $x$ into these two tableaux, the
bumping paths are the same (and hence the required result holds), unless $x$ bumps the rightmost
$z$ in $T_1$. If this happens, the rightmost $z$ (which must have been the only $z$ in its row) gets
bumped into the next lower row. It comes to rest there without bumping anything else, and it is
still the rightmost $z$ in the tableau. Thus it is still true that erasing this $z$ in $T$ produces
$T'$. The induction is therefore complete. □


12.88. Theorem: P-Equivalence and Knuth Equivalence. For all $v, w \in [N]^*$,
$v \equiv_P w$ iff $v \equiv_K w$.


Proof. First, if $v \equiv_P w$, then Theorem 12.80 shows that
$v \equiv_K \mathrm{rw}(P(v)) = \mathrm{rw}(P(w)) \equiv_K w$, so $v \equiv_K w$ by transitivity of $\equiv_K$.
Conversely, assume $v \equiv_K w$. We prove $v \equiv_P w$ by induction on the length $k$ of $v$. For
$k \leq 1$, we have $v = w$ and so $v \equiv_P w$. Now assume $k > 1$ and the result is known for words
of length $k - 1$. Write $v = azb$ and $w = czd$ where $z$ is the largest symbol in $v$ and $w$, and
$z$ does not occur in $b$ or $d$. Write $v' = ab$ and $w' = cd$. By Lemma 12.87, $v' \equiv_K w'$,
$P(v')$ is $P(v)$ with the rightmost $z$ erased, and $P(w')$ is $P(w)$ with the rightmost $z$ erased.
By induction, $P(v') = P(w')$. If we knew that $P(v)$ and $P(w)$ had the same shape, it would
follow that $P(v) = P(w)$. But $P(v)$ and $P(w)$ do have the same shape, because of
Theorem 12.85. So $v \equiv_P w$. □
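For small alphabets and short words, the theorem can be checked exhaustively. The following sketch (our own code, not part of the text) verifies two facts for all words of a fixed length: every elementary Knuth move preserves the insertion tableau, and the number of Knuth classes equals the number of distinct insertion tableaux. Together these force the two equivalence relations to coincide on those words.

```python
from bisect import bisect_right
from itertools import product

def insertion_tableau(word):
    """P(word) as a tuple of tuples (hashable), built by row insertion."""
    rows = []
    for x in word:
        for row in rows:
            pos = bisect_right(row, x)
            if pos == len(row):
                row.append(x)
                break
            row[pos], x = x, row[pos]
        else:
            rows.append([x])
    return tuple(tuple(row) for row in rows)

def knuth_neighbors(word):
    """Words reachable from `word` by one elementary Knuth move, in either direction."""
    result = set()
    for i in range(len(word) - 2):
        a, b, c = word[i:i + 3]
        if b < a <= c or c < a <= b:      # first kind: y x z <-> y z x
            result.add(word[:i] + (a, c, b) + word[i + 3:])
        if a <= c < b or b <= c < a:      # second kind: x z y <-> z x y
            result.add(word[:i] + (b, a, c) + word[i + 3:])
    return result

def classes_agree(alphabet_size, length):
    """True iff P-equivalence and Knuth equivalence coincide on these words."""
    words = list(product(range(1, alphabet_size + 1), repeat=length))
    for w in words:
        if any(insertion_tableau(v) != insertion_tableau(w) for v in knuth_neighbors(w)):
            return False                  # an elementary move changed P
    tableaux = {insertion_tableau(w) for w in words}
    seen, components = set(), 0
    for w in words:                       # count Knuth classes by graph search
        if w in seen:
            continue
        components += 1
        seen.add(w)
        stack = [w]
        while stack:
            for nxt in knuth_neighbors(stack.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return components == len(tableaux)
```

Such a check is of course no substitute for the proof, but it is a useful guard against misremembering the inequalities in the definitions of the two relations.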


We conclude with an application of Theorem 12.84. 


12.89. The Erdős–Szekeres Subsequence Theorem. Every word of length exceeding
$mn$ either has a weakly increasing subsequence of length $m + 1$ or a strictly decreasing
subsequence of length $n + 1$.


Proof. Suppose $w$ is a word with no increasing subsequence of length $m + 1$ and no decreasing
subsequence of length $n + 1$. Let $\lambda$ be the shape of $P(w)$. Then Theorem 12.84 implies that
$\lambda_1 \leq m$ and $\ell(\lambda) \leq n$. Therefore the length of $w$, which is $|\lambda|$, can be no
greater than $\lambda_1 \ell(\lambda) \leq mn$. □


12.14 Quasisymmetric Polynomials


This section introduces generalizations of symmetric polynomials called quasisymmetric
polynomials. Our main goal is to prove a combinatorial formula expanding Schur polynomials
as linear combinations of fundamental quasisymmetric polynomials. The proof of this
formula illuminates the relationship between semistandard tableaux and standard tableaux.


12.90. Definition: Quasisymmetric Polynomials. Given $\alpha \in \mathbb{Z}_{\geq 0}^N$, let
$\mathrm{del}_0(\alpha)$ be the sequence obtained by deleting all zeroes in $\alpha$. We say that two exponent
sequences $\alpha, \beta \in \mathbb{Z}_{\geq 0}^N$ are shifts of each other iff
$\mathrm{del}_0(\alpha) = \mathrm{del}_0(\beta)$. A polynomial $f \in \mathbb{R}[x_1,\ldots,x_N]$ is called
quasisymmetric iff the coefficients of $x^\alpha$ and $x^\beta$ in $f$ are equal whenever $\alpha$ and
$\beta$ are shifts of each other. Let $Q_N$ be the set of all quasisymmetric polynomials in $N$
variables. For $k \geq 0$, let $Q_N^k$ be the set of all $f \in Q_N$ that are homogeneous of degree $k$.


12.91. Example. For $N = 4$, the sequences $(2,1,2,0)$, $(2,1,0,2)$, $(2,0,1,2)$, and $(0,2,1,2)$
are all shifts of each other, but are not shifts of $(2,2,1,0)$. The polynomial

\[ f = 3x_1^2 x_2 x_3^2 + 3x_1^2 x_2 x_4^2 + 3x_1^2 x_3 x_4^2 + 3x_2^2 x_3 x_4^2
     + 2x_1^2 x_2^2 x_3 + 2x_1^2 x_2^2 x_4 + 2x_1^2 x_3^2 x_4 + 2x_2^2 x_3^2 x_4 \]

is quasisymmetric but not symmetric. All monomials in $f$ have degree 5, so $f \in Q_4^5$.


One readily checks that $Q_N$ and each $Q_N^k$ is a real vector space, and $Q_N$ is the direct sum
of its subspaces $Q_N^k$. $Q_N$ is also closed under polynomial multiplication, so $Q_N$ is a subring
of $\mathbb{R}[x_1,\ldots,x_N]$ and a graded algebra over $\mathbb{R}$. Moreover, every symmetric polynomial
is a quasisymmetric polynomial, so $\Lambda_N \subseteq Q_N$ and $\Lambda_N^k \subseteq Q_N^k$ for all $N$
and $k$. The next step is to find bases for the vector spaces $Q_N^k$. We know that for $N \geq k$,
$\Lambda_N^k$ has many bases indexed by integer partitions of $k$. On the other hand, we are about to
see that $Q_N^k$ has bases indexed by compositions of $k$ or by subsets of
$[k-1] = \{1,2,\ldots,k-1\}$.

Recall that a composition of $k$ is a sequence of positive integers, say
$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_s)$, with $\sum_{i=1}^{s} \alpha_i = k$. We write $|\alpha| = k$
and $\ell(\alpha) = s$. Integer partitions are compositions where the parts occur in weakly decreasing
order. Let $\mathrm{Comp}(k)$ be the set of all compositions of $k$, and let $\mathrm{Comp}_N(k)$ be the set
of compositions of $k$ with at most $N$ parts. For $\alpha \in \mathrm{Comp}_N(k)$, we define $\alpha_i = 0$
for $\ell(\alpha) < i \leq N$. The Composition Rule (proved in §1.12) says that
$|\mathrm{Comp}(k)| = 2^{k-1}$ for all $k \geq 1$. In the proof of that rule, we defined a bijection from
$\mathrm{Comp}(k)$ onto the set $\{0,1\}^{k-1}$. There is also a bijection from $\{0,1\}^{k-1}$ onto the
set of all subsets of $[k-1]$, sending a bit string $b_1 b_2 \cdots b_{k-1}$ to the subset
$\{i \in [k-1] : b_i = 1\}$. When we compose these bijections,
$\alpha = (\alpha_1,\ldots,\alpha_s) \in \mathrm{Comp}(k)$ maps to the subset

\[ \mathrm{sub}(\alpha) = \{\alpha_1,\ \alpha_1 + \alpha_2,\ \alpha_1 + \alpha_2 + \alpha_3,\ \ldots,\
   \alpha_1 + \alpha_2 + \cdots + \alpha_{s-1}\} \subseteq [k-1]. \]

The inverse bijection sends a subset $S = \{i_1 < i_2 < \cdots < i_t\} \subseteq [k-1]$ to the composition

\[ \mathrm{comp}(S) = (i_1,\ i_2 - i_1,\ i_3 - i_2,\ \ldots,\ i_t - i_{t-1},\ k - i_t) \in \mathrm{Comp}(k). \]
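These two maps are easy to express in code. In the sketch below (our own illustration), a composition is a tuple of positive integers and a subset is a Python set; the value k must be passed to `comp` explicitly since it cannot be read off from the subset, and we assume k ≥ 1.

```python
from itertools import accumulate

def sub(alpha):
    """sub(alpha): the partial sums of alpha, excluding the full sum k."""
    return set(list(accumulate(alpha))[:-1])

def comp(subset, k):
    """comp(S): successive differences of 0 < i_1 < ... < i_t < k (assumes k >= 1)."""
    cuts = [0] + sorted(subset) + [k]
    return tuple(b - a for a, b in zip(cuts, cuts[1:]))
```

For example, `sub((3, 2, 3))` is {3, 5} and `comp({3, 5}, 8)` is (3, 2, 3).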


Now we can define our first basis for quasisymmetric polynomials, which is analogous 
to the monomial basis for symmetric polynomials. 


12.92. Definition: Monomial Quasisymmetric Polynomials. Given $N > 0$, $k > 0$,
and $\alpha \in \mathrm{Comp}_N(k)$, the monomial quasisymmetric polynomial in $N$ variables indexed by
$\alpha$ is

\[ M_\alpha(x_1,\ldots,x_N) = \sum_{\substack{\beta \in \mathbb{Z}_{\geq 0}^N: \\ \mathrm{del}_0(\beta) = \mathrm{del}_0(\alpha)}} x^\beta. \]

12.93. Example. For $N = 5$ and $\alpha = (3,2,3)$, $M_\alpha(x_1,\ldots,x_5)$ is

\[
\begin{aligned}
& x_1^3 x_2^2 x_3^3 + x_1^3 x_2^2 x_4^3 + x_1^3 x_2^2 x_5^3 + x_1^3 x_3^2 x_4^3 + x_1^3 x_3^2 x_5^3 \\
& {} + x_1^3 x_4^2 x_5^3 + x_2^3 x_3^2 x_4^3 + x_2^3 x_3^2 x_5^3 + x_2^3 x_4^2 x_5^3 + x_3^3 x_4^2 x_5^3.
\end{aligned}
\]

In Example 12.91, $f = 3M_{(2,1,2)} + 2M_{(2,2,1)}$.


More generally, every quasisymmetric polynomial $f \in Q_N^k$ can be expanded uniquely as
a linear combination of $M_\alpha$ where $\alpha \in \mathrm{Comp}_N(k)$. This fact can be proved by adapting
the proof of Theorem 9.23. The intuition for the proof is that $M_\alpha$ groups together all monomials
$x^\beta$ that must have the same coefficient as $x^\alpha$ in any quasisymmetric polynomial. Hence, we
have the following result.


12.94. Theorem: Monomial Basis of $Q_N^k$. For all $k > 0$ and $N > 0$,

\[ \{M_\alpha(x_1,\ldots,x_N) : \alpha \in \mathrm{Comp}_N(k)\} \]

is a basis for the vector space $Q_N^k$. So for all $N \geq k$, the dimension of $Q_N^k$ is
$|\mathrm{Comp}(k)| = 2^{k-1}$.


Recall the definitions of elementary and complete symmetric polynomials:

\[ e_k = \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq N} x_{i_1} x_{i_2} \cdots x_{i_k}
   \quad \text{and} \quad
   h_k = \sum_{1 \leq i_1 \leq i_2 \leq \cdots \leq i_k \leq N} x_{i_1} x_{i_2} \cdots x_{i_k}. \]

The next definition generalizes these formulas by allowing mixtures of strict and weak
inequalities among the subscripts $i_j$.


12.95. Definition: Fundamental Quasisymmetric Polynomials. Given $k > 0$, $N > 0$,
and $S \subseteq [k-1]$, the fundamental quasisymmetric polynomial in $N$ variables indexed by $k$
and $S$ is

\[ FQ_{k,S}(x_1,\ldots,x_N) = \sum_{\substack{1 \leq i_1 \leq i_2 \leq \cdots \leq i_k \leq N: \\ j \in S \Rightarrow i_j < i_{j+1}}}
   x_{i_1} x_{i_2} \cdots x_{i_k}. \]

Intuitively, the set $S$ used to index $FQ_{k,S}$ consists of the positions $j$ where we are forced
to have a strict increase $i_j < i_{j+1}$ in the subscript sequence for the $x$-variables. We prove
in Theorem 12.97 that each $FQ_{k,S}$ is quasisymmetric, as the name suggests.


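12.96. Example. For $N = 3$ and $k = 3$, we have

\[
\begin{aligned}
FQ_{3,\emptyset} &= x_1^3 + x_2^3 + x_3^3 + x_1^2 x_2 + x_1^2 x_3 + x_2^2 x_3 + x_1 x_2^2 + x_1 x_3^2 + x_2 x_3^2 + x_1 x_2 x_3 \\
&= M_{(3)} + M_{(2,1)} + M_{(1,2)} + M_{(1,1,1)} = h_3; \\
FQ_{3,\{1\}} &= x_1 x_2^2 + x_1 x_3^2 + x_2 x_3^2 + x_1 x_2 x_3 = M_{(1,2)} + M_{(1,1,1)}; \\
FQ_{3,\{2\}} &= x_1^2 x_2 + x_1^2 x_3 + x_2^2 x_3 + x_1 x_2 x_3 = M_{(2,1)} + M_{(1,1,1)}; \\
FQ_{3,\{1,2\}} &= x_1 x_2 x_3 = M_{(1,1,1)} = e_3.
\end{aligned}
\]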


More generally, FQ;, 9 = he and FQ; j,_1) = ex for all k and N, so that the fundamental 
quasisymmetric polynomials interpolate between the symmetric polynomials hz and ex. The 
patterns in the preceding example also suggest how to write FQ, ¢ as a linear combination of 
the M,,. The next theorem gives the general formula, which also proves that the polynomials 
FQ;,5 really are quasisymmetric. 
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12.97. Theorem: Monomial Expansion of Fundamental Quasisymmetric Poly- 
nomials. For all k > 0, N > 0, and S$ C [k — 1], 


FQ;,9(@1,---,2N) = es M,(a1,.--,@N)- 


a€Comp y (k): 
SCsub(a) 


Therefore, FQ; 5 € QW. 
Proof. The key observation is the following reformulation of the definition of MQ: for all 
a € Compy(k), 


M,(x1,...,¢n) = x, Lj, Lig’ ** Lig. (12.14) 


1St1 Sig S++ St, SN: 
jEesub(a) Si; <ij41 


We explain this formula through an example where a = (3,2,3) and sub(a) = {3,5}. In 
this case, the right side of (12.14) is 


= a 
- Di, Lig + Lig = 5 ©, 24,03, = Mi3,2,3)- 


1<i1 S12 S13 <tasis <ig=i7=ig <N 1<t1 <ia<ig<N 


In general, the condition j € sub(a) = i; < i;41 ensures that the right side of (12.14) is 
the sum of all monomials whose exponent sequences are shifts of x°, and this sum is Mq. 

Let Z be the set of weakly increasing sequences I = (t1 < ig <--- < ix) with each 
i; in [N]. For I € Z, define the ascent set Asc(I) = {j € [k —1] : 2; < ij;41}, and let 
Xy = Xj, Vj, +++ @;,. 50 far, we know that 


FQ;.5 = » xy and M,= » xX]. 
IeZ: SCAsc(L) IeT: sub(a)=Asc(I) 


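In the sum for $FQ_{k,S}$, let us group together all terms indexed by subscript sequences $I$ that 
have the same ascent set. We get 

\[ FQ_{k,S} = \sum_{T:\ S \subseteq T \subseteq [k-1]} \Bigl( \sum_{I \in \mathcal{I}:\ T = \mathrm{Asc}(I)} x_I \Bigr). \]

Replacing each subset $T$ by the associated composition $\alpha = \mathrm{comp}(T)$, this becomes 

\[ FQ_{k,S} = \sum_{\substack{\alpha \in \mathrm{Comp}_N(k): \\ S \subseteq \mathrm{sub}(\alpha)}} \Bigl( \sum_{I \in \mathcal{I}:\ \mathrm{sub}(\alpha) = \mathrm{Asc}(I)} x_I \Bigr) = \sum_{\substack{\alpha \in \mathrm{Comp}_N(k): \\ S \subseteq \mathrm{sub}(\alpha)}} M_\alpha. \]

Finally, $FQ_{k,S} \in Q_N^k$ follows because we have written $FQ_{k,S}$ as a linear combination of the 
$M_\alpha$, which form a basis for $Q_N^k$. $\square$ 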
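
The expansion in Theorem 12.97 can be checked by brute force for small $N$ and $k$. The following Python sketch stores a polynomial as a counter of exponent vectors and builds both sides directly from the ascent-set descriptions used in the proof. The function names (`fq`, `monomial_qsym`, `expansion_rhs`) are illustrative choices made here, not notation from the text.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

def ascent_set(seq):
    """Asc(I) = {j : i_j < i_{j+1}}, with positions j counted from 1."""
    return {j + 1 for j in range(len(seq) - 1) if seq[j] < seq[j + 1]}

def poly_from_sequences(k, N, keep):
    """Sum of x_I over weakly increasing I in [N]^k whose ascent set passes keep()."""
    poly = Counter()
    for seq in combinations_with_replacement(range(1, N + 1), k):
        if keep(ascent_set(seq)):
            exps = tuple(seq.count(v) for v in range(1, N + 1))
            poly[exps] += 1
    return poly

def fq(k, S, N):
    """FQ_{k,S}(x_1..x_N): sequences I with S contained in Asc(I)."""
    S = set(S)
    return poly_from_sequences(k, N, lambda asc: S <= asc)

def sub(alpha):
    """sub(alpha): partial sums of alpha, excluding the total."""
    sums, total = set(), 0
    for part in alpha[:-1]:
        total += part
        sums.add(total)
    return sums

def monomial_qsym(alpha, N):
    """M_alpha(x_1..x_N), using reformulation (12.14)."""
    target = sub(alpha)
    return poly_from_sequences(sum(alpha), N, lambda asc: asc == target)

def compositions(k):
    """All compositions of k, one for each subset of [k-1]."""
    for r in range(k):
        for cuts in combinations(range(1, k), r):
            bounds = (0,) + cuts + (k,)
            yield tuple(b - a for a, b in zip(bounds, bounds[1:]))

def expansion_rhs(k, S, N):
    """Sum of M_alpha over compositions alpha of k with S contained in sub(alpha)."""
    total = Counter()
    for alpha in compositions(k):
        if set(S) <= sub(alpha):
            total.update(monomial_qsym(alpha, N))
    return total

# Reproduces Example 12.96 (N = 3, k = 3) for every subset S of {1, 2}.
for S in [set(), {1}, {2}, {1, 2}]:
    assert fq(3, S, 3) == expansion_rhs(3, S, 3)
```

Compositions with more than $N$ parts contribute the zero polynomial in this sketch, so summing over all compositions of $k$ agrees with summing over $\mathrm{Comp}_N(k)$.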


For the rest of this section, we assume that $N \ge k$, so that bases of $Q_N^k$ are indexed by 
all compositions of $k$ (or all subsets of $[k-1]$). 


12.98. Theorem: Fundamental Quasisymmetric Basis of $Q_N^k$. For all $N \ge k$, 

\[ \{FQ_{k,S}(x_1,\ldots,x_N) : S \subseteq [k-1]\} \]

is a basis for the vector space $Q_N^k$. 
Proof. Define column vectors $F = (FQ_{k,S} : S \subseteq [k-1])$ and $M = (M_{\mathrm{comp}(S)} : S \subseteq [k-1])$. 
Define a matrix $A$, with rows and columns indexed by subsets of $[k-1]$, such that the entry 
of $A$ in row $S$ and column $T$ is 1 if $S \subseteq T$ and 0 otherwise. Order the rows and columns of 
$A$, $F$, and $M$ using a fixed total ordering on the set of subsets of $[k-1]$ such that smaller 
subsets precede larger subsets in the ordering. Theorem 12.97 says that for all $S \subseteq [k-1]$, 

\[ FQ_{k,S} = \sum_{T \subseteq [k-1]} A(S,T)\, M_{\mathrm{comp}(T)}. \]

In matrix notation, this becomes $F = AM$. 

It now suffices to show that $A$ is an upper-triangular matrix with 1's on the diagonal, 
as in the proof of Theorem 9.43. For each $S$, the diagonal entry of $A$ in row $S$, column $S$ 
is 1, since $S \subseteq S$. Next, if the $S,T$-entry of $A$ is nonzero for some $S \neq T$, then $S \subsetneq T$, so 
$|S| < |T|$. This means that $S$ precedes $T$ in the chosen ordering on subsets, so this entry of 
$A$ appears above the diagonal, as needed. $\square$ 


An inclusion-exclusion calculation yields an explicit formula for the inverse of the transition 
matrix $A$ in the preceding proof. This leads to a formula expanding $M_\alpha$ as a linear 
combination of fundamental quasisymmetric polynomials (see Exercise 12-98). Our next result 
describes the fundamental quasisymmetric expansion of Schur symmetric polynomials. 
Recall that for a standard tableau $U$ with $n$ cells, $\mathrm{Des}(U)$ is the set of all $j \in \{1,2,\ldots,n-1\}$ 
such that $j+1$ appears in a lower row than $j$ in $U$. 


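12.99. Theorem: Fundamental Quasisymmetric Expansion of Schur Polynomials. 
For all $N \ge n$ and all $\lambda \in \mathrm{Par}(n)$, 

\[ s_\lambda(x_1,\ldots,x_N) = \sum_{U \in \mathrm{SYT}(\lambda)} FQ_{n,\mathrm{Des}(U)}(x_1,\ldots,x_N). \]

Proof. Expanding the definitions of $s_\lambda$ and $FQ_{n,\mathrm{Des}(U)}$, we must prove 

\[ \sum_{T \in \mathrm{SSYT}_N(\lambda)} x^T = \sum_{U \in \mathrm{SYT}(\lambda)} \ \sum_{\substack{I \in \mathcal{I}: \\ \mathrm{Des}(U) \subseteq \mathrm{Asc}(I)}} x_I, \]

where $\mathcal{I}$ is the set of weakly increasing subscript sequences $I = (i_1 \le i_2 \le \cdots \le i_n)$ with 
each $i_j \in [N]$, $\mathrm{Asc}(I) = \{j : i_j < i_{j+1}\}$, and $x_I = x_{i_1} \cdots x_{i_n}$. Let $X = \mathrm{SSYT}_N(\lambda)$, and let 
$Y$ be the set of pairs $(U,I)$, where $U \in \mathrm{SYT}(\lambda)$, $I \in \mathcal{I}$, and $\mathrm{Des}(U) \subseteq \mathrm{Asc}(I)$. It suffices to 
define a weight-preserving bijection $F : X \to Y$. 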

Given a semistandard tableau $T \in X$, we compute $F(T) = (U,I)$ as follows. Suppose 
$T$ has $k_1$ 1's, $k_2$ 2's, and so on. Because $T$ is semistandard, the $k_j$ cells containing $j$ in $T$ 
form a horizontal strip for every $j$. To create the standard tableau $U$, we use the following 
standardization algorithm. Replace the $k_1$ 1's in $T$, from left to right, with the integers 
$1,2,\ldots,k_1$. Then replace the $k_2$ 2's in $T$, from left to right, with the integers 
$k_1+1, k_1+2, \ldots, k_1+k_2$. In general, replace the $k_j$ copies of $j$ in $T$, from left to right, with the integers 
$(\sum_{i<j} k_i)+1, (\sum_{i<j} k_i)+2, \ldots, (\sum_{i<j} k_i)+k_j$. Furthermore, let $I$ be the weakly increasing 
sequence consisting of $k_1$ 1's, $k_2$ 2's, and so on. For example, 

[Display: $F$ maps a semistandard tableau $T$ to the pair consisting of its standardization $U$ and the sequence $I = 1112223333455$.] 

We must check that $(U,I)$ does belong to the claimed codomain of $F$. First, is $U$ really 
a standard tableau? On one hand, $U$ is a filling containing the integers $1,2,\ldots,n = |\lambda|$ once 
each. Note that if $i < j$ are two values somewhere in $T$, the standardization process always 
relabels $i$ with a lower integer than $j$. If there are multiple copies of $i$ in $T$, these copies get 
relabeled with an increasing sequence of consecutive integers moving from left to right. We 
see from these comments that since the rows of $T$ weakly increase, the rows of $U$ strictly 
increase, and similarly for columns. Thus $U$ is standard, as needed. 

Next, is $\mathrm{Des}(U) \subseteq \mathrm{Asc}(I)$? We prove the contrapositive: fix $k \in \{1,2,\ldots,n-1\}$ with 
$k \notin \mathrm{Asc}(I)$, and show $k \notin \mathrm{Des}(U)$. We have assumed $i_k = i_{k+1} = i$. By definition of 
standardization, the unique copies of $k$ and $k+1$ in $U$ were used to relabel two occurrences 
of $i$ in $T$. Now, the cells containing $i$ in $T$ form a horizontal strip that is relabeled from 
left to right. So $k+1$ must appear strictly right and weakly above $k$ in $U$, which means 
$k \notin \mathrm{Des}(U)$. We have now proved that $F(T) = (U,I)$ is in $Y$. By definition of $I$, $x^T = x_I$, 
so $F$ is weight-preserving. 

To finish, we construct a map $G : Y \to X$ that is the two-sided inverse of $F$. Given 
$(U,I) \in Y$, let $T = G(U,I)$ be the filling of shape $\lambda$ obtained by replacing each $k$ in $U$ by 
$i_k$. For example, 

[Display: $G$ is applied to the standard tableau $U$ with rows $1\ 3\ 4\ 5\ 6$, $2\ 7\ 8\ 11$, and $9\ 10\ 12\ 13$, paired with a weakly increasing sequence $I$ of length 13, giving a semistandard tableau $T$ of the same shape.] 

Since $U$ is standard and $I$ is weakly increasing, the new filling $T$ has weakly increasing rows 
and columns. Since $\mathrm{Des}(U) \subseteq \mathrm{Asc}(I)$, we see (as in the previous paragraph) that every run 
of equal values in $I$ is used to relabel a horizontal strip of cells in $\mathrm{dg}(\lambda)$. Thus $T$ has strictly 
increasing columns, so $T \in \mathrm{SSYT}_N(\lambda)$. Finally, it is routine to check that $F \circ G = \mathrm{id}_Y$ and 
$G \circ F = \mathrm{id}_X$. $\square$ 
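
The maps $F$ and $G$ from the preceding proof are easy to experiment with. The sketch below is a minimal Python rendering, assuming a tableau is stored as a list of rows; the tableau `T` used at the end is a small hypothetical example chosen here, not the one displayed in the text.

```python
def standardize(T):
    """F(T) = (U, I) for a semistandard tableau T given as a list of rows."""
    # Sort cells by value, then by column: within each horizontal strip of
    # equal values this is exactly the left-to-right order.
    cells = sorted((value, c, r) for r, row in enumerate(T) for c, value in enumerate(row))
    U = [[0] * len(row) for row in T]
    I = []
    for label, (value, c, r) in enumerate(cells, start=1):
        U[r][c] = label
        I.append(value)
    return U, I

def unstandardize(U, I):
    """G(U, I): replace each entry k of U by i_k."""
    return [[I[k - 1] for k in row] for row in U]

def descent_set(U):
    """Des(U): all j such that j+1 lies in a lower row than j."""
    row_of = {k: r for r, row in enumerate(U) for k in row}
    return {j for j in range(1, len(row_of)) if row_of[j + 1] > row_of[j]}

def ascent_set(I):
    """Asc(I): all positions j (counted from 1) with i_j < i_{j+1}."""
    return {j + 1 for j in range(len(I) - 1) if I[j] < I[j + 1]}

# Hypothetical semistandard tableau of shape (4, 3, 1).
T = [[1, 1, 2, 3], [2, 3, 3], [4]]
U, I = standardize(T)
assert U == [[1, 2, 4, 7], [3, 5, 6], [8]]
assert I == [1, 1, 2, 2, 3, 3, 3, 4]
assert descent_set(U) <= ascent_set(I)   # (U, I) lies in Y
assert unstandardize(U, I) == T          # G undoes F
```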


12.15 Pfaffians and Perfect Matchings 


Given a square matrix $A$ with $N$ rows and $N$ columns, we have defined the determinant of 
$A$ by the formula 

\[ \det(A) = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} A(i, w(i)) \]

(see §12.9). This section studies the Pfaffian, which is a number associated to a triangular 
array $(a_{i,j} : 1 \le i < j \le N)$ where $N$ is even. Pfaffians arise in the theory of skew-symmetric 
matrices. 


12.100. Definition: Skew-Symmetric Matrices. An $N \times N$ matrix $A$ is called skew-symmetric 
iff $A^{\mathrm{T}} = -A$, which means $A(i,j) = -A(j,i)$ for all $i,j \in \{1,2,\ldots,N\}$. 


If $A$ is a real or complex skew-symmetric matrix, then $A(i,i) = 0$ for all $i$. Moreover, 
$A$ is completely determined by the triangular array of numbers $(A(i,j) : 1 \le i < j \le N)$ 
lying strictly above the main diagonal. The starting point for the theory of Pfaffians is the 
observation that, for all even $N$ and all skew-symmetric $A$, $\det(A)$ is a perfect square. (For 
odd $N$, the condition $A^{\mathrm{T}} = -A$ can be used to show that $\det(A) = 0$.) 


12.101. Example. A general skew-symmetric $2 \times 2$ matrix has the form $A = \begin{bmatrix} 0 & a \\ -a & 0 \end{bmatrix}$. 
In this case, $\det(A) = a^2$ is a square. A skew-symmetric $4 \times 4$ matrix looks like 

\[ A = \begin{bmatrix} 0 & a & b & c \\ -a & 0 & d & e \\ -b & -d & 0 & f \\ -c & -e & -f & 0 \end{bmatrix}. \]

A somewhat tedious calculation reveals that 

\[ \begin{aligned} \det(A) &= a^2f^2 + b^2e^2 + c^2d^2 - 2abef + 2acdf - 2bcde \\ &= (af + cd - be)^2. \end{aligned} \]


The remainder of this section develops the theory needed to explain the phenomenon 
observed in the last example. 


12.102. Definition: Pfaffians. Suppose $N$ is even and $A$ is a skew-symmetric $N \times N$ 
matrix. Let $\mathrm{SPf}_N$ be the set of all permutations $w \in S_N$ such that 

\[ w_1 < w_3 < w_5 < \cdots < w_{N-1}, \quad w_1 < w_2,\ w_3 < w_4,\ w_5 < w_6,\ \ldots,\ \text{and } w_{N-1} < w_N. \]

The Pfaffian of $A$, denoted $\mathrm{Pf}(A)$, is the number 

\[ \mathrm{Pf}(A) = \sum_{w \in \mathrm{SPf}_N} \mathrm{sgn}(w)\, A(w_1,w_2) A(w_3,w_4) A(w_5,w_6) \cdots A(w_{N-1},w_N). \]


12.103. Example. If $N = 2$, $\mathrm{SPf}_2 = \{12\}$ and $\mathrm{Pf}(A) = A(1,2)$ (we write permutations in 
one-line form here). If $N = 4$, $\mathrm{SPf}_4 = \{1234, 1423, 1324\}$ and 

\[ \mathrm{Pf}(A) = A(1,2)A(3,4) + A(1,4)A(2,3) - A(1,3)A(2,4). \]

For a general $N \times N$ matrix $A$, $\det(A)$ is a sum of $|S_N| = N!$ terms. Similarly, for a 
skew-symmetric matrix $A$, $\mathrm{Pf}(A)$ is a sum of $|\mathrm{SPf}_N|$ terms. 


12.104. Theorem: Size of $\mathrm{SPf}_N$. For even $N > 0$, $|\mathrm{SPf}_N| = 1 \cdot 3 \cdot 5 \cdots (N-1)$. 


Proof. We can construct each permutation $w \in \mathrm{SPf}_N$ as follows. First, $w_1$ must be 1. There 
are $N-1$ choices for $w_2$, which can be anything other than 1. To finish building $w$, choose 
an arbitrary permutation $v = v_1 v_2 \cdots v_{N-2} \in \mathrm{SPf}_{N-2}$. For $i$ between 1 and $N-2$, set 

\[ w_{i+2} = \begin{cases} v_i + 1 & \text{if } v_i < w_2 - 1; \\ v_i + 2 & \text{otherwise.} \end{cases} \]

Informally, we are renumbering the $v$'s to use symbols in $\{1,2,\ldots,N\} - \{w_1,w_2\} = [N] - \{1,w_2\}$ 
and then appending this word to $w_1 w_2$. By the Product Rule, $|\mathrm{SPf}_N| = (N-1) \cdot |\mathrm{SPf}_{N-2}|$. 
Since $|\mathrm{SPf}_2| = 1$, the formula in the theorem follows by induction. $\square$ 
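
For small $N$, both the definition of the Pfaffian and the count in Theorem 12.104 can be verified by direct enumeration. The following Python sketch filters all permutations of $\{1,\ldots,N\}$, which is only practical for small $N$; the function names are illustrative, and matrices are assumed to be lists of rows.

```python
from itertools import permutations

def spf(N):
    """All words w in SPf_N, as tuples in one-line form (N even)."""
    words = []
    for w in permutations(range(1, N + 1)):
        pairs_ok = all(w[i] < w[i + 1] for i in range(0, N, 2))        # w1<w2, w3<w4, ...
        firsts_ok = all(w[i] < w[i + 2] for i in range(0, N - 2, 2))   # w1<w3<w5<...
        if pairs_ok and firsts_ok:
            words.append(w)
    return words

def sign(w):
    """sgn(w) = (-1)^inv(w)."""
    inv = sum(1 for a in range(len(w)) for b in range(a + 1, len(w)) if w[a] > w[b])
    return -1 if inv % 2 else 1

def pfaffian_by_definition(A):
    """Pf(A) computed from Definition 12.102 (A is a list of rows)."""
    N = len(A)
    total = 0
    for w in spf(N):
        term = sign(w)
        for i in range(0, N, 2):
            term *= A[w[i] - 1][w[i + 1] - 1]
        total += term
    return total

def odd_product(N):
    """1 * 3 * 5 * ... * (N - 1)."""
    result = 1
    for odd in range(1, N, 2):
        result *= odd
    return result

assert spf(4) == [(1, 2, 3, 4), (1, 3, 2, 4), (1, 4, 2, 3)]
assert all(len(spf(N)) == odd_product(N) for N in (2, 4, 6, 8))

# The 4 x 4 matrix of Example 12.101 with (a, b, c, d, e, f) = (1, 2, 3, 4, 5, 6).
A = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
assert pfaffian_by_definition(A) == 1 * 6 + 3 * 4 - 2 * 5   # af + cd - be
```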


Recall that the Laplace expansions in Theorem 12.51 provide recursive formulas for 
evaluating determinants. Similar recursive formulas exist for evaluating Pfaffians. The key 
difference is that two rows and columns are erased at each stage, whereas in Laplace ex- 
pansions only one row and column are erased at a time. 


12.105. Theorem: Pfaffian Expansion along Row 1. Suppose $N$ is even and $A$ is an 
$N \times N$ skew-symmetric matrix. For each $i < j$, let $A[[i,j]]$ be the matrix obtained from $A$ 
by deleting row $i$, row $j$, column $i$, and column $j$; this is a skew-symmetric matrix of size 
$(N-2) \times (N-2)$. We have 

\[ \mathrm{Pf}(A) = \sum_{j=2}^{N} (-1)^j A(1,j)\, \mathrm{Pf}(A[[1,j]]). \]
Proof. By definition, 

\[ \mathrm{Pf}(A) = \sum_{w \in \mathrm{SPf}_N} \mathrm{sgn}(w) \prod_{\substack{i=1 \\ i \text{ odd}}}^{N} A(w_i, w_{i+1}). \]

By the proof of Theorem 12.104, there is a bijection $\mathrm{SPf}_N \to \{2,3,\ldots,N\} \times \mathrm{SPf}_{N-2}$ 
that maps $w \in \mathrm{SPf}_N$ to $(j,v)$, where $j = w_2$ and $v$ is obtained from $w_3 w_4 \cdots w_N$ by 
renumbering the symbols to be $1,2,\ldots,N-2$. We use this bijection to change the indexing 
set for the summation from $\mathrm{SPf}_N$ to $\{2,\ldots,N\} \times \mathrm{SPf}_{N-2}$. Counting inversions, we see 
that $\mathrm{inv}(w) = \mathrm{inv}(v) + j - 2$ since $w_2 = j$ exceeds $j-2$ symbols to its right. So 
$\mathrm{sgn}(w) = (-1)^j\, \mathrm{sgn}(v)$. Next, $A(w_1,w_2) = A(1,j)$. For odd $i > 1$, it follows from the definitions 
that $A(w_i,w_{i+1}) = A[[1,j]](v_{i-2},v_{i-1})$. Putting all this information into the formula, and 
replacing $i$ by $i+2$ in the product over odd $i$ from 3 to $N$, we see that 

\[ \mathrm{Pf}(A) = \sum_{j=2}^{N} (-1)^j A(1,j) \sum_{v \in \mathrm{SPf}_{N-2}} \mathrm{sgn}(v) \prod_{\substack{i=1 \\ i \text{ odd}}}^{N-2} A[[1,j]](v_i, v_{i+1}). \]

The inner sum is precisely $\mathrm{Pf}(A[[1,j]])$, so the proof is complete. $\square$ 


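12.106. Example. Let us compute the Pfaffian of the matrix 

\[ A = \begin{bmatrix} 0 & x & -y & 0 & 0 & 0 \\ -x & 0 & 0 & y & 0 & 0 \\ y & 0 & 0 & x & -y & 0 \\ 0 & -y & -x & 0 & 0 & y \\ 0 & 0 & y & 0 & 0 & x \\ 0 & 0 & 0 & -y & -x & 0 \end{bmatrix}. \]

Expanding along row 1 gives 

\[ \mathrm{Pf}(A) = x\, \mathrm{Pf}\begin{bmatrix} 0 & x & -y & 0 \\ -x & 0 & 0 & y \\ y & 0 & 0 & x \\ 0 & -y & -x & 0 \end{bmatrix} - (-y)\, \mathrm{Pf}\begin{bmatrix} 0 & y & 0 & 0 \\ -y & 0 & 0 & y \\ 0 & 0 & 0 & x \\ 0 & -y & -x & 0 \end{bmatrix}. \]

By expanding these $4 \times 4$ Pfaffians in the same way, or by using the formula in 
Example 12.103, we obtain 

\[ \mathrm{Pf}(A) = x(x^2 + y^2) + y(xy) = x^3 + 2xy^2. \]

The combinatorial significance of this Pfaffian evaluation is revealed in §12.16. 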
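
The row-1 expansion of Theorem 12.105 translates directly into a short recursive routine. The sketch below is one possible Python rendering (the names `pfaffian` and `example_matrix` are chosen here for illustration); it evaluates the matrix of Example 12.106 at integer values of $x$ and $y$ and compares the result with $x^3 + 2xy^2$.

```python
def pfaffian(A):
    """Pfaffian of an even-order skew-symmetric matrix via expansion along row 1."""
    N = len(A)
    if N == 0:
        return 1                      # empty product: Pf of the 0 x 0 matrix
    total = 0
    for j in range(1, N):             # 0-based index j is column j+1 in the text
        if A[0][j] == 0:
            continue
        keep = [r for r in range(N) if r not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]   # the matrix A[[1, j+1]]
        sign = 1 if (j + 1) % 2 == 0 else -1              # (-1)^(j+1)
        total += sign * A[0][j] * pfaffian(minor)
    return total

def example_matrix(x, y):
    """The 6 x 6 matrix of Example 12.106."""
    return [
        [0, x, -y, 0, 0, 0],
        [-x, 0, 0, y, 0, 0],
        [y, 0, 0, x, -y, 0],
        [0, -y, -x, 0, 0, y],
        [0, 0, y, 0, 0, x],
        [0, 0, 0, -y, -x, 0],
    ]

for x in range(-3, 4):
    for y in range(-3, 4):
        assert pfaffian(example_matrix(x, y)) == x**3 + 2 * x * y**2
```

Checking a polynomial identity at finitely many integer points is only evidence, not a proof; a symbolic algebra package would be needed to confirm the identity in $x$ and $y$ exactly.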


Pfaffians are closely related to perfect matchings of graphs, which we now discuss. 


12.107. Definition: Perfect Matchings. Let $G$ be a simple graph with vertex set $V$ 
and edge set $E$. A perfect matching of $G$ is a subset $M$ of $E$ such that each $v \in V$ is the 
endpoint of exactly one edge in $M$. Let $\mathrm{PM}(G)$ be the set of perfect matchings of $G$. 


12.108. Example. For the graph shown in Figure 12.13, one perfect matching is 

\[ M_1 = \{\{1,6\}, \{2,10\}, \{3,9\}, \{4,8\}, \{5,7\}\}. \]

Another perfect matching is 

\[ M_2 = \{\{1,2\}, \{3,4\}, \{5,7\}, \{6,9\}, \{8,10\}\}. \]

FIGURE 12.13 
Graph used to illustrate perfect matchings. 


A perfect matching on a graph G is a set partition of the vertex set of G into blocks of 
size 2 where each such block is an edge of G. Therefore, if G has N vertices and a perfect 
matching exists for G, then N must be even. The next result shows that perfect matchings 
on a complete graph can be encoded by permutations in $\mathrm{SPf}_N$. 


12.109. Theorem: Perfect Matchings on a Complete Graph. Suppose $N$ is even and 
$K_N$ is the simple graph with vertex set $\{1,2,\ldots,N\}$ and edge set $\{\{i,j\} : 1 \le i < j \le N\}$. 
The map $f : \mathrm{SPf}_N \to \mathrm{PM}(K_N)$ defined by 

\[ f(w_1 w_2 \cdots w_N) = \{\{w_1,w_2\}, \{w_3,w_4\}, \ldots, \{w_{N-1},w_N\}\} \]

is a bijection. Consequently, 

\[ |\mathrm{PM}(K_N)| = 1 \cdot 3 \cdot 5 \cdots (N-1). \]

Proof. Note first that $f$ does map into the set $\mathrm{PM}(K_N)$. Next, a matching $M \in \mathrm{PM}(K_N)$ 
is a set of $N/2$ edges $M = \{\{i_1,i_2\}, \{i_3,i_4\}, \ldots, \{i_{N-1},i_N\}\}$. Since $\{i,j\} = \{j,i\}$, we can 
choose the notation so that $i_1 < i_2$, $i_3 < i_4$, ..., and $i_{N-1} < i_N$. Similarly, since the $N/2$ 
edges of $M$ can be presented in any order, we can change notation again (if needed) to 
arrange that $i_1 < i_3 < i_5 < \cdots < i_{N-1}$. Then the permutation $w = i_1 i_2 i_3 \cdots i_N$ is in $\mathrm{SPf}_N$ 
and satisfies $f(w) = M$. Thus $f$ maps onto $\mathrm{PM}(K_N)$. To see that $f$ is one-to-one, suppose 
$v = j_1 j_2 j_3 \cdots j_N$ is another element of $\mathrm{SPf}_N$ such that $f(v) = M = f(w)$. We must have 
$j_1 = 1 = i_1$. Since $M$ has only one edge incident to vertex 1, and since $\{i_1,i_2\} \in M$ and 
$\{j_1,j_2\} \in M$ by definition of $f$, we conclude that $i_2 = j_2$. Now $i_3$ and $j_3$ must both be 
the smallest vertex in the set $\{1,2,\ldots,N\} - \{i_1,i_2\}$, so $i_3 = j_3$. Then $i_4 = j_4$ follows, as 
above, since $M$ is a perfect matching. Continuing similarly, we see that $i_k = j_k$ for all $k$, so 
$v = w$ and $f$ is one-to-one. Since $f$ is a bijection, the formula for $|\mathrm{PM}(K_N)|$ follows from 
Theorem 12.104. $\square$ 


The preceding theorem leads to the following combinatorial interpretation for Pfaffians. 
Given a perfect matching $M \in \mathrm{PM}(K_N)$, use Theorem 12.109 to write $M = f(w)$ for a 
unique $w \in \mathrm{SPf}_N$. Define the sign of $M$ to be $\mathrm{sgn}(w)$, and define the weight of $M$ to be 

\[ \mathrm{wt}(M) = \prod_{\substack{\{i,j\} \in M \\ i < j}} x_{i,j} = \prod_{\substack{i=1 \\ i \text{ odd}}}^{N} x_{w_i,w_{i+1}}, \]

where the $x_{i,j}$ (for $1 \le i < j \le N$) are formal variables. Let $X$ be the skew-symmetric 
matrix with entries $x_{i,j}$ above the main diagonal. It follows from Theorem 12.109 and the 
definition of a Pfaffian that 

\[ \sum_{M \in \mathrm{PM}(K_N)} \mathrm{sgn}(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X). \]

More generally, we have the following result. 


12.110. Theorem: Pfaffians and Perfect Matchings. Let $N$ be even, and let $G$ be 
a simple graph with vertex set $V = \{1,2,\ldots,N\}$ and edge set $E(G)$. Let $x_{i,j}$ be formal 
variables, and let $X = X(G)$ be the skew-symmetric matrix with entries 

\[ X(i,j) = \begin{cases} x_{i,j} & \text{if } i < j \text{ and } \{i,j\} \in E(G); \\ -x_{j,i} & \text{if } i > j \text{ and } \{i,j\} \in E(G); \\ 0 & \text{otherwise.} \end{cases} \]

Then $\sum_{M \in \mathrm{PM}(G)} \mathrm{sgn}(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X(G))$. 
Proof. We have already observed that 

\[ \sum_{M \in \mathrm{PM}(K_N)} \mathrm{sgn}(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X(K_N)). \tag{12.15} \]

Given the graph $G$, let $\epsilon$ be the unique algebra homomorphism on the polynomial ring 
$\mathbb{R}[x_{i,j} : 1 \le i < j \le N]$ that sends $x_{i,j}$ to $x_{i,j}$ if $\{i,j\} \in E(G)$ and sends $x_{i,j}$ to 0 
if $\{i,j\} \notin E(G)$. (See the Appendix for more discussion of evaluation homomorphisms.) 
Applying $\epsilon$ to the left side of (12.15) produces 

\[ \sum_{M \in \mathrm{PM}(G)} \mathrm{sgn}(M)\, \mathrm{wt}(M), \]

since all matchings of $K_N$ that use an edge not in $E(G)$ are mapped to zero. On the other 
hand, since $\epsilon$ is an algebra homomorphism and the Pfaffian of a matrix is a polynomial 
in the entries of the matrix, we can compute $\epsilon(\mathrm{Pf}(X(K_N)))$ by applying $\epsilon$ to each entry 
of $X(K_N)$ and taking the Pfaffian of the resulting matrix. So, applying $\epsilon$ to the right side 
of (12.15) gives 

\[ \epsilon(\mathrm{Pf}(X(K_N))) = \mathrm{Pf}(\epsilon(X(K_N))) = \mathrm{Pf}(X(G)). \qquad \square \]


12.111. Remark. The last result shows that $\mathrm{Pf}(X(G))$ is a signed sum of distinct monomials, 
where there is one monomial for each perfect matching of $G$. Because of the signs, 
one cannot compute $|\mathrm{PM}(G)|$ by setting $x_{i,j} = 1$ for each $\{i,j\} \in E(G)$. However, for 
certain graphs $G$, one can introduce extra signs into the upper part of the matrix $X(G)$ to 
counteract the sign arising from $\mathrm{sgn}(M)$. This process is illustrated in the next section. 


We can now give a combinatorial proof of the main result linking Pfaffians and deter- 
minants. 


12.112. Theorem: Pfaffians and Determinants. For every even $N > 0$ and every 
$N \times N$ skew-symmetric matrix $A$, $\det(A) = \mathrm{Pf}(A)^2$. 


Proof. First we use the skew-symmetry of $A$ to cancel some terms in the sum 

\[ \det(A) = \sum_{w \in S_N} \mathrm{sgn}(w) \prod_{i=1}^{N} A(i, w(i)). \]

We can cancel every term indexed by a permutation $w$ whose functional digraph contains 
at least one cycle of odd length (see §3.6). If $w$ has a cycle of length 1, then $w(i) = i$ for 
some $i$. So $A(i,w(i)) = A(i,i) = 0$ by skew-symmetry, and the term indexed by this $w$ 
is zero. On the other hand, suppose $w$ has no fixed points, but $w$ does have at least one 
cycle of odd length. Among all the odd-length cycles of $w$, choose the cycle $(i_1,i_2,\ldots,i_k)$ 
whose minimum element is as small as possible. Reverse the orientation of this cycle to 
get a permutation $w' \neq w$. For example, if $w = (3,8,4)(2,5,7)(1,6)(9,10)$, then 
$w' = (3,8,4)(7,5,2)(1,6)(9,10)$. In general, $\mathrm{sgn}(w') = \mathrm{sgn}(w)$ since $w$ and $w'$ have the same 
cycle structure (see Theorem 7.34). However, since $k$ is odd and $A$ is skew-symmetric, 

\[ A(i_1,i_2)A(i_2,i_3) \cdots A(i_{k-1},i_k)A(i_k,i_1) = -A(i_2,i_1)A(i_3,i_2) \cdots A(i_k,i_{k-1})A(i_1,i_k). \]

It follows that the term in $\det(A)$ indexed by $w'$ is the negative of the term in $\det(A)$ 
indexed by $w$, so these two terms cancel. Since the map sending $w$ to $w'$ is an involution, 
we conclude that 

\[ \det(A) = \sum_{w \in S_N^{\mathrm{ev}}} \mathrm{sgn}(w) \prod_{i=1}^{N} A(i, w(i)), \]

where $S_N^{\mathrm{ev}}$ denotes the set of permutations of $\{1,2,\ldots,N\}$ with only even-length cycles. 


The next step is to compare the terms in this sum to the terms in $\mathrm{Pf}(A)^2$. Using the 
distributive law to square the defining formula for $\mathrm{Pf}(A)$, we see that 

\[ \mathrm{Pf}(A)^2 = \sum_{u \in \mathrm{SPf}_N} \sum_{v \in \mathrm{SPf}_N} \mathrm{sgn}(u)\, \mathrm{sgn}(v) \prod_{i \text{ odd}} [A(u_i,u_{i+1}) A(v_i,v_{i+1})]. \]

For each $w \in S_N^{\mathrm{ev}}$ indexing an uncanceled term in $\det(A)$, we associate a pair 
$g(w) = (u,v) \in \mathrm{SPf}_N^2$ indexing a summand in $\mathrm{Pf}(A)^2$ as follows. Consider the functional digraph 
$G(w)$ with vertex set $\{1,2,\ldots,N\}$ and edge set $\{(i,w(i)) : 1 \le i \le N\}$, which is a disjoint 
union of cycles. Define a perfect matching $M_1$ on $G(w)$ (viewed as an undirected graph) 
by starting at the minimum element in each cycle and including every other edge as one 
travels around the cycle. Define another perfect matching $M_2$ on $G(w)$ by taking all the 
edges not used in $M_1$. Finally, let $u$ and $v$ be the permutations in $\mathrm{SPf}_N$ that encode $M_1$ 
and $M_2$ via the bijection in Theorem 12.109. For example, if $w = (1,5,2,8,6,3)(4,7)$, then 
$M_1 = \{\{1,5\},\{2,8\},\{6,3\},\{4,7\}\}$ and $M_2 = \{\{5,2\},\{8,6\},\{3,1\},\{7,4\}\}$, so $u = 15283647$ 
and $v = 13254768$. 

The function $g$ sending $w$ to $(u,v)$ is a bijection from $S_N^{\mathrm{ev}}$ to $\mathrm{SPf}_N^2$. Given $(u,v) \in \mathrm{SPf}_N^2$, 
we find $g^{-1}(u,v)$ as follows. First take the union of the perfect matchings encoded by $u$ 
and $v$. This produces a graph that is a disjoint union of cycles of even length, as is readily 
checked. One can restore the directions on each cycle by recalling that the outgoing edge 
from the minimum element in each cycle belongs to the matching encoded by $u$. For example, 
the pair $(u,v) = (15234867, 12374856)$ maps to $g^{-1}(u,v) = (1,5,6,7,3,2)(4,8)$. 

Throughout the following discussion, fix $w \in S_N^{\mathrm{ev}}$ and $(u,v) \in \mathrm{SPf}_N^2$ with $(u,v) = g(w)$. 
To complete the proof, it suffices to show that the term in $\det(A)$ indexed by $w$ equals the 
term in $\mathrm{Pf}(A)^2$ indexed by $(u,v)$. Write $w$ in cycle form as 

\[ w = (m_1,n_1,\ldots,z_1)(m_2,n_2,\ldots,z_2) \cdots (m_k,n_k,\ldots,z_k), \]

where $m_1 < m_2 < \cdots < m_k$ are the minimum elements in their cycles. Define two words 
(permutations in one-line form) 

\[ \begin{aligned} u^* &= m_1 n_1 \cdots z_1\ m_2 n_2 \cdots z_2\ \cdots\ m_k n_k \cdots z_k; \\ v^* &= n_1 \cdots z_1 m_1\ n_2 \cdots z_2 m_2\ \cdots\ n_k \cdots z_k m_k. \end{aligned} \]
Thus $u^*$ is obtained by erasing the parentheses in the particular cycle notation for $w$ just 
mentioned, and $v^*$ is obtained similarly after first cycling the values in each cycle one step 
to the left. Since each $m_i$ is the smallest value in its cycle, it follows that 

\[ \mathrm{inv}(v^*) = N - k + \mathrm{inv}(u^*), \]

where $k = \mathrm{cyc}(w)$ is the number of cycles in $w$. Using Theorem 7.34, we get 
$\mathrm{sgn}(u^*)\,\mathrm{sgn}(v^*) = (-1)^{N-\mathrm{cyc}(w)} = \mathrm{sgn}(w)$. Since all the edges $(i,w(i))$ in $G(w)$ arise by 
pairing off consecutive letters in $u^*$ and $v^*$, we have 

\[ \mathrm{sgn}(w) \prod_{i=1}^{N} A(i,w(i)) = \mathrm{sgn}(u^*)\,\mathrm{sgn}(v^*) \prod_{i \text{ odd}} [A(u^*_i,u^*_{i+1}) A(v^*_i,v^*_{i+1})]. \]
We now transform the right side to the term indexed by $(u,v)$ in $\mathrm{Pf}(A)^2$, as follows. Note that 
the words $u^*$ and $v^*$ provide non-standard encodings of the perfect matchings $M_1$ and $M_2$ 
encoded by $u$ and $v$ (where $u^*$ encodes the matching $\{\{u^*_1,u^*_2\},\{u^*_3,u^*_4\},\ldots\}$, and similarly 
for $v^*$). To convert these encodings to the standard encodings, first reverse each pair of 
consecutive letters $u^*_i, u^*_{i+1}$ in $u^*$ such that $u^*_i > u^*_{i+1}$ and $i$ is odd. Each such reversal causes 
$\mathrm{sgn}(u^*)$ to change, but this change is balanced by the fact that $A(u^*_{i+1},u^*_i) = -A(u^*_i,u^*_{i+1})$. 
Similarly, we can reverse pairs of consecutive letters in $v^*$ that are out of order. The next 
step is to sort the pairs in $u^*$ to force $u_1 < u_3 < u_5 < \cdots < u_{N-1}$. This sorting can be 
achieved by repeatedly swapping adjacent pairs $a < b$; $c < d$ in the word, where $a > c$ and $a$ 
is in an odd position. The swap sending $a,b,c,d$ to $c,d,a,b$ can be achieved by applying the 
two transpositions $(a,c)$ and $(b,d)$ on the left. So this modification of $u^*$ does not change 
$\mathrm{sgn}(u^*)$, nor does it affect the product of the factors $A(u^*_i,u^*_{i+1})$ (since multiplication is 
commutative). Similarly, we can sort the pairs in $v^*$ to obtain $v$ without changing the 
formula. We conclude finally that 

\[ \begin{aligned} \mathrm{sgn}(w) \prod_{i=1}^{N} A(i,w(i)) &= \mathrm{sgn}(u^*)\,\mathrm{sgn}(v^*) \prod_{i \text{ odd}} [A(u^*_i,u^*_{i+1}) A(v^*_i,v^*_{i+1})] \\ &= \mathrm{sgn}(u)\,\mathrm{sgn}(v) \prod_{i \text{ odd}} [A(u_i,u_{i+1}) A(v_i,v_{i+1})]. \qquad \square \end{aligned} \]

The following example illustrates the calculations at the end of the preceding proof. 

12.113. Example. Suppose $w = (3,8)(11,4,2,9)(1,10,6,7,5,12) \in S_{12}^{\mathrm{ev}}$, so $k = \mathrm{cyc}(w) = 3$. 
We begin by writing the standard cycle notation for $w$: 

\[ w = (1,10,6,7,5,12)(2,9,11,4)(3,8). \]

Next we set 

\[ u^* = 1,10;\ 6,7;\ 5,12;\ 2,9;\ 11,4;\ 3,8; \qquad v^* = 10,6;\ 7,5;\ 12,1;\ 9,11;\ 4,2;\ 8,3. \]

Observe that $\mathrm{inv}(v^*) = \mathrm{inv}(u^*) + (12-3)$ due to the cyclic shifting of 1, 2, 3, so that 
$\mathrm{sgn}(u^*)\,\mathrm{sgn}(v^*) = (-1)^{12-3} = \mathrm{sgn}(w)$. Now we modify $u^*$ and $v^*$ so that the elements in 
each pair increase: 

\[ u' = 1,10;\ 6,7;\ 5,12;\ 2,9;\ 4,11;\ 3,8; \qquad v' = 6,10;\ 5,7;\ 1,12;\ 9,11;\ 2,4;\ 3,8. \]

Note that $\mathrm{sgn}(u') = -\mathrm{sgn}(u^*)$ since we switched 11 and 4, but this is offset by the fact that 
$A(11,4) = -A(4,11)$. So $\mathrm{sgn}(u^*) \prod_i A(u^*_i,u^*_{i+1}) = \mathrm{sgn}(u') \prod_i A(u'_i,u'_{i+1})$, and similarly for 
$v^*$ and $v'$. Finally, we sort the pairs so that the minimum elements increase, obtaining 

\[ u = 1,10;\ 2,9;\ 3,8;\ 4,11;\ 5,12;\ 6,7; \qquad v = 1,12;\ 2,4;\ 3,8;\ 5,7;\ 6,10;\ 9,11. \]

This sorting does not introduce any further sign changes, so we have successfully transformed 
the term indexed by $w$ in $\det(A)$ to the term indexed by $(u,v)$ in $\mathrm{Pf}(A)^2$. 
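
The map $g$ from the proof of Theorem 12.112 can be traced mechanically. The following Python sketch is one way to do so, assuming a permutation is stored as a dictionary $i \mapsto w(i)$; the helper names are illustrative. It reproduces the pair $(u,v)$ found in the proof's example and in Example 12.113.

```python
def perm_from_cycles(cycles):
    """Permutation as a dict i -> w(i), built from a list of cycles."""
    w = {}
    for cyc in cycles:
        for t, i in enumerate(cyc):
            w[i] = cyc[(t + 1) % len(cyc)]
    return w

def cycles_from_min(w):
    """Cycles of w, each written starting at its minimum, ordered by minimum."""
    seen, cycles = set(), []
    for start in sorted(w):
        if start not in seen:
            cyc, i = [], start
            while i not in seen:
                seen.add(i)
                cyc.append(i)
                i = w[i]
            cycles.append(cyc)
    return cycles

def encode(matching):
    """Word in SPf_N encoding a perfect matching (inverse of f in Theorem 12.109)."""
    pairs = sorted(tuple(sorted(edge)) for edge in matching)
    return [vertex for pair in pairs for vertex in pair]

def g(w):
    """The pair (u, v) attached to a permutation w with only even-length cycles."""
    M1, M2 = [], []
    for cyc in cycles_from_min(w):
        if len(cyc) % 2:
            raise ValueError("w has an odd-length cycle")
        for t in range(len(cyc)):
            edge = (cyc[t], cyc[(t + 1) % len(cyc)])
            (M1 if t % 2 == 0 else M2).append(edge)   # alternate edges around the cycle
    return encode(M1), encode(M2)

u, v = g(perm_from_cycles([[1, 5, 2, 8, 6, 3], [4, 7]]))
assert u == [1, 5, 2, 8, 3, 6, 4, 7]     # u = 15283647
assert v == [1, 3, 2, 5, 4, 7, 6, 8]     # v = 13254768
```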

12.16 Domino Tilings of Rectangles 


This section presents P. W. Kasteleyn's proof of a formula for the number of ways to tile a 
rectangle with dominos. Let $\mathrm{Dom}(m,n)$ be the set of domino tilings of a rectangle of width 
$m$ and height $n$. This set is empty if $m$ and $n$ are both odd, so we assume throughout that 
$m$ is even. Given a tiling $T \in \mathrm{Dom}(m,n)$, let $N_h(T)$ and $N_v(T)$ be the number of horizontal 
and vertical dominos (respectively) appearing in $T$. Define the weight of the tiling $T$ to be 
$\mathrm{wt}(T) = x^{N_h(T)} y^{N_v(T)}$. 


12.114. Theorem: Domino Tiling Formula. For all even $m \ge 1$ and all $n \ge 1$, 

\[ \sum_{T \in \mathrm{Dom}(m,n)} \mathrm{wt}(T) = 2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n} \sqrt{x^2 \cos^2\!\left(\frac{j\pi}{m+1}\right) + y^2 \cos^2\!\left(\frac{k\pi}{n+1}\right)}. \tag{12.16} \]

Setting $x = y = 1$ gives the expression for $|\mathrm{Dom}(m,n)|$ stated in the Introduction. 
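
Formula (12.16) is easy to evaluate numerically. The sketch below is a floating-point Python evaluation (the function name `domino_weight_sum` is a label chosen here); rounding the result recovers familiar tiling counts such as 36 for the $4 \times 4$ board and 12988816 for the $8 \times 8$ board.

```python
import math

def domino_weight_sum(m, n, x=1.0, y=1.0):
    """Right side of (12.16) for even width m and height n, in floating point."""
    if m % 2:
        raise ValueError("m must be even")
    value = 2.0 ** (m * n // 2)
    for j in range(1, m // 2 + 1):
        cj = math.cos(j * math.pi / (m + 1))
        for k in range(1, n + 1):
            ck = math.cos(k * math.pi / (n + 1))
            value *= math.sqrt(x * x * cj * cj + y * y * ck * ck)
    return value

assert round(domino_weight_sum(2, 3)) == 3
assert round(domino_weight_sum(4, 4)) == 36
assert round(domino_weight_sum(8, 8)) == 12988816
# With x = 2, y = 3 and a 2 x 3 rectangle: x^3 + 2xy^2 = 44 (compare Example 12.106).
assert abs(domino_weight_sum(2, 3, 2, 3) - 44) < 1e-9
```

Because the product is computed with floating-point square roots, the value is only approximately an integer; rounding is safe for small boards but would need exact arithmetic for very large ones.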


Step 1: Conversion to a Perfect Matching Problem. Introduce a simple graph 
$G(m,n)$ with vertex set $V = \{1,2,\ldots,mn\}$ and edge set $E = E_h \cup E_v$, where 

\[ E_h = \{\{k,k+1\} : k \not\equiv 0 \pmod{m}\}, \qquad E_v = \{\{k,k+m\} : 1 \le k \le m(n-1)\}. \]

This graph models an $m \times n$ rectangle $R$, as follows. The unit square in the $i$th row from 
the bottom and the $j$th column from the left in $R$ corresponds to the vertex $(i-1)m+j$, for 
$1 \le i \le n$ and $1 \le j \le m$. There is an edge in $E_h$ for each pair of two horizontally adjacent 
squares in $R$, and there is an edge in $E_v$ for each pair of two vertically adjacent squares 
in $R$. There is a bijection between the set $\mathrm{Dom}(m,n)$ of domino tilings of $R$ and the set 
$\mathrm{PM}(G(m,n))$ of perfect matchings of $G(m,n)$. Given a domino tiling, we need only replace 
each domino covering two adjacent squares by the edge corresponding to these two squares. 
This does give a perfect matching, since each square is covered by exactly one domino. If a 
tiling $T$ corresponds to a matching $M$ under this bijection, we have $N_h(T) = |M \cap E_h|$ and 
$N_v(T) = |M \cap E_v|$. So, defining $\mathrm{wt}(M) = x^{|M \cap E_h|} y^{|M \cap E_v|}$, we have 

\[ \sum_{T \in \mathrm{Dom}(m,n)} \mathrm{wt}(T) = \sum_{M \in \mathrm{PM}(G(m,n))} \mathrm{wt}(M). \]
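
The graph $G(m,n)$ and its perfect matchings can be generated directly for small rectangles. The following Python sketch builds the two edge sets and enumerates matchings by always matching the smallest uncovered vertex first; the function names are illustrative, and the search is exponential, so it is meant only for small $m$ and $n$.

```python
from collections import Counter

def grid_edges(m, n):
    """Edge lists (E_h, E_v) of G(m, n), each edge stored as (smaller, larger)."""
    E_h = [(k, k + 1) for k in range(1, m * n) if k % m != 0]
    E_v = [(k, k + m) for k in range(1, m * (n - 1) + 1)]
    return E_h, E_v

def perfect_matchings(vertices, edges):
    """Yield each perfect matching once, as a list of edges."""
    if not vertices:
        yield []
        return
    v = min(vertices)                 # the smallest free vertex must be matched next
    for a, b in edges:
        if a == v and b in vertices:
            for rest in perfect_matchings(vertices - {a, b}, edges):
                yield [(a, b)] + rest

def tiling_weights(m, n):
    """Counter mapping (N_h, N_v) to the number of tilings with those counts."""
    E_h, E_v = grid_edges(m, n)
    horizontal = set(E_h)
    counts = Counter()
    for M in perfect_matchings(frozenset(range(1, m * n + 1)), E_h + E_v):
        n_h = sum(1 for edge in M if edge in horizontal)
        counts[(n_h, len(M) - n_h)] += 1
    return counts

# Width 2, height 3: the generating function is x^3 + 2 x y^2.
assert tiling_weights(2, 3) == Counter({(3, 0): 1, (1, 2): 2})
assert sum(tiling_weights(4, 4).values()) == 36
```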


12.115. Example. Figure 12.14 shows the rectangle $R$ and associated graph $G(m,n)$ when 
$m = 4$ and $n = 5$. 

Figure 12.15 shows a domino tiling of R and the associated perfect matching. The tiling 
and matching shown both have weight x*ty°®. 


Step 2: Enumeration via Pfaffians. Let $X_1$ be the skew-symmetric matrix defined in 
Theorem 12.110, taking $G$ there to be $G(m,n)$. We know that 

\[ \sum_{M \in \mathrm{PM}(G(m,n))} \mathrm{sgn}(M) \prod_{\substack{\{i,j\} \in M \\ i < j}} x_{i,j} = \mathrm{Pf}(X_1). \tag{12.17} \]


[Figure: the rectangle $R$, with unit squares numbered 

17 18 19 20 
13 14 15 16 
 9 10 11 12 
 5  6  7  8 
 1  2  3  4 

and the graph $G(4,5)$ on the corresponding vertices 1 to 20.] 

FIGURE 12.14 
Graph used to model domino tilings. 


FIGURE 12.15 
A domino tiling and a perfect matching. 


We introduce the terms horizontal edge, odd vertical edge, and even vertical edge to refer 
(respectively) to edges in $E_h$, edges $\{k,k+m\}$ in $E_v$ with $k$ odd, and edges $\{k,k+m\}$ in 
$E_v$ with $k$ even. Consider the algebra homomorphism $\epsilon : \mathbb{R}[\{x_{i,j}\}] \to \mathbb{R}[x,y]$ that sends $x_{i,j}$ 
to $x$ if $\{i,j\}$ is a horizontal edge, sends $x_{i,j}$ to $y$ if $\{i,j\}$ is an even vertical edge, and sends 
$x_{i,j}$ to $-y$ if $\{i,j\}$ is an odd vertical edge. Let $X$ be the matrix obtained by applying this 
homomorphism to each entry of the matrix $X_1$. Explicitly, $X$ is the $mn \times mn$ matrix with 
entries 

\[ X(i,j) = \begin{cases} 
x & \text{if } j = i+1 \text{ and } i \not\equiv 0 \pmod{m}; \\
y & \text{if } j = i+m \text{ and } i \equiv 0 \pmod{2}; \\
-y & \text{if } j = i+m \text{ and } i \equiv 1 \pmod{2}; \\
-x & \text{if } i = j+1 \text{ and } j \not\equiv 0 \pmod{m}; \\
-y & \text{if } i = j+m \text{ and } j \equiv 0 \pmod{2}; \\
y & \text{if } i = j+m \text{ and } j \equiv 1 \pmod{2}; \\
0 & \text{otherwise.} 
\end{cases} \tag{12.18} \]

For example, the matrix $X$ when $m = 4$ and $n = 3$ appears in Figure 12.16. Let 
$\mathrm{sgn}^*(M) = \mathrm{sgn}(M)(-1)^t$, where $t$ is the number of odd vertical edges in $M$. Applying 
the ring homomorphism $\epsilon$ to each side of (12.17) gives 

\[ \sum_{M \in \mathrm{PM}(G(m,n))} \mathrm{sgn}^*(M)\, \mathrm{wt}(M) = \mathrm{Pf}(X). \]
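
The identity just displayed can be tested numerically on small rectangles. The Python sketch below builds $X$ from (12.18), evaluates $\mathrm{Pf}(X)$ by the row-1 expansion of Theorem 12.105, and compares it with the signed sum computed from an explicit list of matchings. All function names are illustrative, and integer values are substituted for $x$ and $y$.

```python
def kasteleyn_matrix(m, n, x, y):
    """The mn x mn matrix X of (12.18), as a list of rows (m even)."""
    size = m * n
    X = [[0] * size for _ in range(size)]
    for i in range(1, size + 1):
        if i % m != 0:                       # horizontal edge {i, i+1}
            X[i - 1][i], X[i][i - 1] = x, -x
        if i + m <= size:                    # vertical edge {i, i+m}
            entry = y if i % 2 == 0 else -y  # even edges get y, odd edges get -y
            X[i - 1][i + m - 1], X[i + m - 1][i - 1] = entry, -entry
    return X

def pfaffian(A):
    """Pfaffian via expansion along row 1 (Theorem 12.105)."""
    N = len(A)
    if N == 0:
        return 1
    total = 0
    for j in range(1, N):
        if A[0][j] != 0:
            keep = [r for r in range(N) if r not in (0, j)]
            minor = [[A[r][c] for c in keep] for r in keep]
            total += (1 if j % 2 else -1) * A[0][j] * pfaffian(minor)
    return total

def matchings(m, n):
    """Perfect matchings of G(m, n); edges appear in the order of the SPf word."""
    def extend(free):
        if not free:
            yield []
            return
        k = min(free)
        for partner in (k + 1, k + m):
            if partner in free and (partner != k + 1 or k % m != 0):
                for rest in extend(free - {k, partner}):
                    yield [(k, partner)] + rest
    return extend(frozenset(range(1, m * n + 1)))

def signed_weight_sum(m, n, x, y):
    """Sum of sgn*(M) wt(M) over all perfect matchings M of G(m, n)."""
    total = 0
    for M in matchings(m, n):
        word = [vertex for edge in M for vertex in edge]
        inv = sum(1 for a in range(len(word)) for b in range(a + 1, len(word))
                  if word[a] > word[b])
        n_v = sum(1 for a, b in M if b - a == m)                 # vertical edges
        t = sum(1 for a, b in M if b - a == m and a % 2 == 1)    # odd vertical edges
        total += (-1) ** (inv + t) * x ** (len(M) - n_v) * y ** n_v
    return total

for m, n, x, y in [(2, 2, 1, 1), (2, 3, 2, 3), (4, 3, 2, 5)]:
    assert pfaffian(kasteleyn_matrix(m, n, x, y)) == signed_weight_sum(m, n, x, y)
```

In `pfaffian`, the 0-based column index `j` corresponds to column $j+1$ of the text, so the sign $(-1)^{j+1}$ is $+1$ exactly when `j` is odd.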


Step 3: Sign Analysis. The crucial fact to be verified is that $\mathrm{sgn}^*(M) = +1$ for every 
perfect matching $M$. Before proving this fact, we consider an example. 
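
\[ X = \begin{bmatrix}
0 & x & 0 & 0 & -y & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-x & 0 & x & 0 & 0 & y & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -x & 0 & x & 0 & 0 & -y & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -x & 0 & 0 & 0 & 0 & y & 0 & 0 & 0 & 0 \\
y & 0 & 0 & 0 & 0 & x & 0 & 0 & -y & 0 & 0 & 0 \\
0 & -y & 0 & 0 & -x & 0 & x & 0 & 0 & y & 0 & 0 \\
0 & 0 & y & 0 & 0 & -x & 0 & x & 0 & 0 & -y & 0 \\
0 & 0 & 0 & -y & 0 & 0 & -x & 0 & 0 & 0 & 0 & y \\
0 & 0 & 0 & 0 & y & 0 & 0 & 0 & 0 & x & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -y & 0 & 0 & -x & 0 & x & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & y & 0 & 0 & -x & 0 & x \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -y & 0 & 0 & -x & 0
\end{bmatrix} \]

FIGURE 12.16 
Matrix used to enumerate domino tilings ($m = 4$, $n = 3$). 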


12.116. Example. Consider the following domino tiling of a $16 \times 4$ rectangle: 

[Figure: a domino tiling of the $16 \times 4$ rectangle, with rows labeled 1 to 4 from bottom to top and columns labeled 1 to 16 from left to right.] 

This tiling corresponds to a perfect matching $M$ of $G(16,4)$, which is encoded (as in 
Theorem 12.109) by a word $w \in \mathrm{SPf}_{64}$. By definition, $\mathrm{sgn}(M) = (-1)^{\mathrm{inv}(w)}$. In our example, the 
word of $M$ is 

\[ \begin{aligned} w = {} & 1,2;\ 3,4;\ 5,21;\ 6,7;\ 8,24;\ 9,25;\ 10,11;\ 12,13;\ 14,15;\ 16,32; \\ & 17,33;\ 18,19;\ 20,36;\ 22,38;\ 23,39;\ 26,27;\ 28,44;\ 29,30;\ 31,47;\ \ldots;\ 60,61;\ 62,63. \end{aligned} \]

Note that $w$ consists of pairs of letters indicating the two squares occupied by each domino in 
the tiling. We imagine placing dominos on the board one at a time, in the order specified by 
$w$, and updating $\mathrm{sgn}(M)$ and $\mathrm{sgn}^*(M)$ as we go along. When computing $\mathrm{inv}(w)$, the second 
symbol in each pair sometimes causes inversions with symbols following it in $w$. Pairs 
corresponding to horizontal dominos never cause any inversions. Consider the inversions 
caused by a vertical domino (i.e., a vertical edge in $M$). The first vertical edge appearing in 
$w$ is $\{5,21\}$. The 21 is greater than the fifteen symbols $6,7,\ldots,20$ corresponding to squares 
to the right of column 5 in row 1 and squares to the left of column 5 in row 2, which have 
not been covered by a domino yet. So this edge increases $\mathrm{inv}(w)$ by $15 = m-1$, which 
causes a sign change in $\mathrm{sgn}(M)$. However, since this edge is an odd vertical edge, that sign 
change is counteracted in $\mathrm{sgn}^*(M)$. 

The next vertical edge in $w$ is $\{8,24\}$. The symbol 24 causes $14 = m-2$ new inversions, 
corresponding to squares to the right of column 8 in row 1 and squares to the left of column 
8 in row 2, excluding column 5. These inversions do not change $\mathrm{sgn}(M)$, and $\mathrm{sgn}^*(M)$ is 
also unchanged since $\{8,24\}$ is an even vertical edge. 

Continuing similarly, we eventually come to the odd vertical edge $\{23,39\}$ in $w$. Recalling 
the order of domino placement, we see that the 39 causes inversions with the following nine 
symbols to its right in $w$: 37, 35, 34, 31, 30, 29, 28, 27, 26. Since nine is odd, we get a sign 
change in $\mathrm{sgn}(M)$, but this is counteracted in $\mathrm{sgn}^*(M)$ since we have just added an odd 
vertical edge. After accounting for all the dominos, we find (Exercise 12-110) that indeed 
$\mathrm{sgn}^*(M) = +1$, since the insertion of each vertical domino never leads to a net sign change. 


Now we are ready to prove that $\mathrm{sgn}^*(M) = +1$ for a general $M \in \mathrm{PM}(G(m,n))$. Let 
$w \in \mathrm{SPf}_{mn}$ be the word encoding $M$. As in the example, we calculate $\mathrm{sgn}^*(M) = (-1)^{\mathrm{inv}(w)}(-1)^t$ 
incrementally by scanning $w$ from left to right. Initially, before scanning any edges, $\mathrm{sgn}^*(M)$ 
is $+1$. Suppose the next edge in the scan is the horizontal edge $\{k,k+1\}$. By definition of 
$w$ (see Theorem 12.109), $k$ is the smallest symbol that has not appeared previously in $w$. 
So $k$ and $k+1$ cannot cause any new inversions with symbols following them. Similarly, $t$ 
(the number of odd vertical edges) does not increase when we scan this edge. So $\mathrm{sgn}^*(M)$ 
is still $+1$ after scanning this edge. 

Before continuing, we need the following observation: for every row $i \ge 1$, the number 
of vertical dominos that start in row $i$ and end in row $i+1$ is even (possibly zero). This is 
proved by induction on $i$. To prove the case $i = 1$, suppose there are $a$ horizontal dominos 
in row 1. Then there must be $m - 2a$ vertical dominos starting in row 1. This number is 
even, since $m$ is even. Now assume the result holds in row $i-1$. In row $i$, suppose there are 
$a$ horizontal dominos, $b$ vertical dominos coming up from row $i-1$, and $c$ vertical dominos 
leading up into row $i+1$. Then $c = m - 2a - b$. Since $m$ is even and (by hypothesis) $b$ is 
even, $c$ must also be even. 

Now suppose the next edge in the scan is a vertical edge $\{k,k+m\}$ in column $j$ that 
covers rows $i$ and $i+1$ (so $k = (i-1)m + j$). As before, the symbol $k$ causes no new inversions. 
Let us count the inversions in $w$ between $k+m$ and symbols to its right. There are $m-1$ 
symbols that might cause inversions with $k+m$, namely $k+1, k+2, \ldots, k+(m-1)$, but 
some of these symbols may have already appeared in $w$. Specifically, if there are $a$ vertical 
dominos covering rows $i$ and $i+1$ to the left of column $j$, and $b$ vertical dominos covering 
rows $i-1$ and $i$ to the right of column $j$, then $a+b$ of the symbols just mentioned have 
already appeared in $w$. So, the inclusion of the new edge increases $\mathrm{inv}(w)$ by $(m-1)-(a+b)$. 
Now, let there be $b'$ vertical dominos covering rows $i-1$ and $i$ to the left of column $j$, and 
$c$ horizontal dominos in row $i$ to the left of column $j$. Since $m-1 \equiv 1 \pmod 2$, $-a \equiv a \pmod 2$, 
$-b \equiv b' \pmod 2$ (by the observation in the last paragraph), and $2c \equiv 0 \pmod 2$, 
we see that 

\[ (m-1) - a - b \equiv 1 + a + b' + 2c \pmod 2. \]

But $1 + a + b' + 2c = j$ since $a + b' + 2c$ counts all the columns left of column $j$ in row $i$. We 
conclude, finally, that the increase in $\mathrm{inv}(w)$ caused by the insertion of the edge $\{k,k+m\}$ 
has the same parity as the column index $j$. Since $j$ and $k$ have the same parity, the number 
of new inversions is odd iff the new vertical edge is an odd vertical edge. So there is no 
net change in $\mathrm{sgn}^*(M) = (-1)^{\mathrm{inv}(w)}(-1)^t$ when we add this edge. This completes the proof 
that $\mathrm{sgn}^*(M) = +1$. 
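
The conclusion of Step 3 can be confirmed by exhaustive search on small boards. The Python sketch below lists every perfect matching of $G(m,n)$ in the order of its encoding word and checks that $\mathrm{sgn}^*(M) = +1$ each time. The names are illustrative, and the code assumes $m$ is even, as in the text.

```python
def matchings(m, n):
    """Perfect matchings of G(m, n); edges are listed by increasing smallest vertex,
    so concatenating them gives the word w in SPf_{mn} from Theorem 12.109."""
    def extend(free):
        if not free:
            yield []
            return
        k = min(free)
        for partner in (k + 1, k + m):
            if partner in free and (partner != k + 1 or k % m != 0):
                for rest in extend(free - {k, partner}):
                    yield [(k, partner)] + rest
    return extend(frozenset(range(1, m * n + 1)))

def sgn_star(M, m):
    """sgn*(M) = (-1)^inv(w) * (-1)^t, with t the number of odd vertical edges."""
    word = [vertex for edge in M for vertex in edge]
    inv = sum(1 for a in range(len(word)) for b in range(a + 1, len(word))
              if word[a] > word[b])
    t = sum(1 for a, b in M if b - a == m and a % 2 == 1)
    return (-1) ** (inv + t)

for m, n in [(2, 3), (4, 3), (4, 4), (6, 3)]:
    assert all(sgn_star(M, m) == 1 for M in matchings(m, n))
```

Dropping the factor $(-1)^t$ in `sgn_star` gives the unmodified sign $\mathrm{sgn}(M)$, which does take both values; for instance, the all-vertical tiling of the $2 \times 2$ square has word 1324 and sign $-1$.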


Step 4: Evaluation of the Pfaffian. Combining Steps 1 through 3 and Theorem 12.112, 
we have 

\[ \sum_{T \in \mathrm{Dom}(m,n)} \mathrm{wt}(T) = \sum_{M \in \mathrm{PM}(G(m,n))} \mathrm{wt}(M) = \mathrm{Pf}(X) = \sqrt{\det(X)}, \]

where $X$ is the $mn \times mn$ matrix defined by (12.18). So we are reduced to evaluating the 
determinant of $X$. The idea is to replace $X$ by a similar matrix $U^{-1}XU$ whose determinant 
is easier to evaluate. For this purpose, we pause to introduce tensor products of matrices. 


12.117. Definition: Tensor Product of Matrices. If $A = [a_{i,j}]$ is any $n \times n$ matrix and
$B$ is any $m \times m$ matrix, let $A \otimes B$ be the $mn \times mn$ matrix given in block form by

\[ A \otimes B = \begin{bmatrix} a_{1,1}B & a_{1,2}B & \cdots & a_{1,n}B \\ \vdots & \vdots & & \vdots \\ a_{n,1}B & a_{n,2}B & \cdots & a_{n,n}B \end{bmatrix}. \]

Formally, $(A \otimes B)(m(i_1-1)+i_2,\; m(j_1-1)+j_2) = A(i_1,j_1)B(i_2,j_2)$ for all $i_1, j_1, i_2, j_2$
satisfying $1 \le i_1, j_1 \le n$ and $1 \le i_2, j_2 \le m$.


The following properties of tensor products may be routinely verified: 

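(a) $(A_1 + A_2) \otimes B = (A_1 \otimes B) + (A_2 \otimes B)$ and $A \otimes (B_1 + B_2) = (A \otimes B_1) + (A \otimes B_2)$.

(b) For any scalar $c$, $(cA) \otimes B = c(A \otimes B) = A \otimes (cB)$.

(c) $(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1A_2) \otimes (B_1B_2)$.

(d) If $A$ and $B$ are invertible, then $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.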
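
As a computational aside, property (c) can be spot-checked on small integer matrices. The following Python sketch is only an illustration: the helper names `kron` and `matmul` and the sample matrices are hypothetical, and matrices are assumed to be stored as lists of rows with 0-based indices.

```python
def kron(A, B):
    """Tensor product of an n x n matrix A and an m x m matrix B (lists of rows)."""
    n, m = len(A), len(B)
    # With 0-based indices, the entry in row m*i1 + i2 and column m*j1 + j2
    # is A[i1][j1] * B[i2][j2], matching the formal definition above.
    return [[A[i1][j1] * B[i2][j2] for j1 in range(n) for j2 in range(m)]
            for i1 in range(n) for i2 in range(m)]


def matmul(X, Y):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Hypothetical sample matrices (2 x 2 and 3 x 3).
A1 = [[1, 2], [3, 4]]
A2 = [[0, 1], [1, 1]]
B1 = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
B2 = [[1, 1, 0], [0, 1, 2], [1, 0, 1]]

lhs = matmul(kron(A1, B1), kron(A2, B2))
rhs = kron(matmul(A1, A2), matmul(B1, B2))
print(lhs == rhs)  # property (c) predicts equality
```

Exact integer arithmetic is used so that the comparison is not affected by rounding.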

For every $k \ge 1$, let $I_k$ denote the $k \times k$ identity matrix, let $F_k$ denote the $k \times k$ diagonal
matrix with diagonal entries $-1, 1, -1, 1, \ldots, (-1)^k$, let $J_k$ denote the $k \times k$ matrix with 1's
on the antidiagonal, and let $Q_k$ denote the $k \times k$ matrix with 1's on the diagonal above
the main diagonal, $-1$'s on the diagonal below the main diagonal, and 0's elsewhere. For
example,

\[ I_5 = \begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \end{bmatrix}, \qquad
F_5 = \begin{bmatrix} -1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&-1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&-1 \end{bmatrix}, \]

\[ J_5 = \begin{bmatrix} 0&0&0&0&1 \\ 0&0&0&1&0 \\ 0&0&1&0&0 \\ 0&1&0&0&0 \\ 1&0&0&0&0 \end{bmatrix}, \qquad
Q_5 = \begin{bmatrix} 0&1&0&0&0 \\ -1&0&1&0&0 \\ 0&-1&0&1&0 \\ 0&0&-1&0&1 \\ 0&0&0&-1&0 \end{bmatrix}. \]

The definition of $X$ in (12.18) can now be written

\[ X = x(I_n \otimes Q_m) + y(Q_n \otimes F_m). \]

(Compare to Figure 12.16.) The following lemma can be established by routine calculations.


12.118. Lemma: Eigenvectors of $Q_k$. For $0 \le a \le k+1$ and $1 \le b \le k$, define complex
numbers

\[ U_k(a,b) = i^a \sin\left(\frac{\pi ab}{k+1}\right), \qquad \lambda_k(b) = 2i\cos\left(\frac{\pi b}{k+1}\right). \]

For $1 \le a, b \le k$, we have

\[ U_k(a+1,b) - U_k(a-1,b) = \lambda_k(b)U_k(a,b). \]

Therefore, the column vector $[U_k(1,b), U_k(2,b), \ldots, U_k(k,b)]^{\mathrm{tr}}$ is an eigenvector of $Q_k$
associated to the eigenvalue $\lambda_k(b)$. Let $U_k = [U_k(a,b)]_{1 \le a,b \le k}$, and let $D_k$ be the $k \times k$ diagonal
matrix with diagonal entries $\lambda_k(b)$. Then $Q_kU_k = U_kD_k$, $(-1)^aU_k(a,b) = -U_k(a,k+1-b)$
for all $a, b$ between 1 and $k$, and so $F_kU_k = -U_kJ_k$.


The columns of $U_k$ are linearly independent, because they are eigenvectors of $Q_k$ associated
to distinct eigenvalues. Therefore, $U_k$ is invertible, so the lemma gives $U_k^{-1}Q_kU_k = D_k$
and $U_k^{-1}F_kU_k = -J_k$. Let $U = U_n \otimes U_m$, so $U^{-1} = U_n^{-1} \otimes U_m^{-1}$. Using properties of tensor
products, we calculate

\begin{align*}
U^{-1}XU &= x(U_n^{-1} \otimes U_m^{-1})(I_n \otimes Q_m)(U_n \otimes U_m) + y(U_n^{-1} \otimes U_m^{-1})(Q_n \otimes F_m)(U_n \otimes U_m) \\
&= x(U_n^{-1}I_nU_n) \otimes (U_m^{-1}Q_mU_m) + y(U_n^{-1}Q_nU_n) \otimes (U_m^{-1}F_mU_m) \\
&= x(I_n \otimes D_m) - y(D_n \otimes J_m).
\end{align*}


\[ U^{-1}XU = \begin{bmatrix} B_1 & 0 & 0 \\ 0 & B_2 & 0 \\ 0 & 0 & B_3 \end{bmatrix}, \qquad
B_b = \begin{bmatrix} xr_1 & 0 & 0 & -ys_b \\ 0 & xr_2 & -ys_b & 0 \\ 0 & -ys_b & xr_3 & 0 \\ -ys_b & 0 & 0 & xr_4 \end{bmatrix} \quad (b = 1, 2, 3). \]

FIGURE 12.17
The matrix $U^{-1}XU$ for $m = 4$, $n = 3$ (shown in block form; each 0 is a $4 \times 4$ zero block); here $r_a = 2i\cos(\pi a/5)$ and $s_b = 2i\cos(\pi b/4)$.


For example, if $X$ is the matrix shown in Figure 12.16, then $U^{-1}XU$ is the matrix
shown in Figure 12.17. In general, $U^{-1}XU$ is a block-diagonal matrix consisting of $n$ blocks
of size $m \times m$. The $b$th block has entries $-y\lambda_n(b)$ on the anti-diagonal and entries $x\lambda_m(a)$ (for
$1 \le a \le m$) on the diagonal. Now, since $m$ is even, we can reorder the rows and columns
of each block into this order: $1, m, 2, m-1, 3, m-2, \ldots, m/2, m/2+1$. This reordering can
be accomplished by performing an even number of row and column switches on $U^{-1}XU$,
so the determinant does not change. The new matrix is also block-diagonal, consisting of
$mn/2$ blocks of size $2 \times 2$ that look like

\[ \begin{bmatrix} x\lambda_m(a) & -y\lambda_n(b) \\ -y\lambda_n(b) & x\lambda_m(m+1-a) \end{bmatrix} \quad \text{for } 1 \le a \le m/2 \text{ and } 1 \le b \le n. \]

Now, $\lambda_m(m+1-a) = 2i\cos(\pi(m+1-a)/(m+1)) = -2i\cos(\pi a/(m+1)) = -\lambda_m(a)$. It
follows that the determinant of the $2 \times 2$ block just mentioned is

\[ -x^2\lambda_m(a)^2 - y^2\lambda_n(b)^2 = 4x^2\cos^2\left(\frac{\pi a}{m+1}\right) + 4y^2\cos^2\left(\frac{\pi b}{n+1}\right). \]
Finally, $\det(X) = \det(U^{-1}XU)$ is the product of these determinants as $a$ ranges from 1 to
$m/2$ and $b$ ranges from 1 to $n$. Taking the square root of $\det(X)$ and factoring out powers
of 2 produces formula (12.16). Remarkable!


Rational-Slope Dyck Paths. If $\gcd(r,s) = 1$, then the number of lattice paths from $(0,0)$
to $(r,s)$ that never go below the line $sx = ry$ is $\frac{1}{r+s}\binom{r+s}{r,s}$. For any lattice path ending at
$(r,s)$, the $r+s$ cyclic shifts of this path are all distinct, and exactly one of them is a Dyck
path of slope $s/r$.
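
The count above can be checked numerically for small coprime pairs. The sketch below is an illustration only; the function name `count_rational_dyck` is hypothetical, and it assumes that "never going below the line" can be tested at lattice vertices, requiring $ry \ge sx$ at every vertex $(x,y)$ of the path.

```python
from math import comb


def count_rational_dyck(r, s):
    """Count north/east lattice paths from (0,0) to (r,s) with r*y >= s*x at every vertex."""
    # ways[x][y] = number of valid paths from (0,0) to (x,y).
    ways = [[0] * (s + 1) for _ in range(r + 1)]
    ways[0][0] = 1
    for x in range(r + 1):
        for y in range(s + 1):
            if (x, y) == (0, 0) or r * y < s * x:
                continue  # start vertex, or a vertex strictly below the line
            from_west = ways[x - 1][y] if x > 0 else 0
            from_south = ways[x][y - 1] if y > 0 else 0
            ways[x][y] = from_west + from_south
    return ways[r][s]


for r, s in [(2, 3), (3, 4), (3, 5), (5, 7)]:
    # Second column is the brute-force count, third is the closed formula.
    print((r, s), count_rational_dyck(r, s), comb(r + s, r) // (r + s))
```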


The Chung-Feller Theorem. A lattice path from $(0,0)$ to $(n,n)$ has $k$ flaws iff the path
has $k$ north steps starting below $y = x$. For $k$ between 0 and $n$, there are $C_n = \frac{1}{n+1}\binom{2n}{n,n}$
paths ending at $(n,n)$ with $k$ flaws. So the number of flaws in a random lattice path from
$(0,0)$ to $(n,n)$ is uniformly distributed on $\{0, 1, 2, \ldots, n\}$.
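
The uniform distribution of flaws is easy to observe by exhaustive enumeration for small $n$. This is a minimal sketch with a hypothetical function name, `flaw_distribution`; it counts a flaw for each north step that starts at a point $(x,y)$ with $y < x$, as in the statement above.

```python
from itertools import combinations
from math import comb


def flaw_distribution(n):
    """Return a list whose k-th entry counts paths from (0,0) to (n,n) with k flaws."""
    counts = [0] * (n + 1)
    # Choose which of the 2n steps are north steps; the rest are east steps.
    for north_positions in combinations(range(2 * n), n):
        north = set(north_positions)
        x = y = flaws = 0
        for step in range(2 * n):
            if step in north:
                if y < x:          # this north step starts below the line y = x
                    flaws += 1
                y += 1
            else:
                x += 1
        counts[flaws] += 1
    return counts


for n in range(1, 5):
    print(n, flaw_distribution(n), comb(2 * n, n) // (n + 1))
```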


Rook-Equivalence of Ferrers Boards. For each integer partition $\mu$, $r_k(\mu)$ is the number of
ways to place $k$ non-attacking rooks on $F_\mu = \mathrm{dg}(\mu)$, and $R_\mu(x) = \sum_{k \ge 0} r_k(\mu)x^k$. For
all partitions $\mu = (\mu_1 \ge \mu_2 \ge \cdots \ge \mu_n \ge 0)$ and $\nu = (\nu_1 \ge \nu_2 \ge \cdots \ge \nu_n \ge 0)$
with $|\mu| = n = |\nu|$, we have $R_\mu(x) = R_\nu(x)$ iff the multisets $[\mu_i + i : 1 \le i \le n]$ and
$[\nu_i + i : 1 \le i \le n]$ are equal.
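
The multiset criterion can be compared with directly computed rook numbers on small boards. The following is a sketch under stated assumptions: the names `rook_polynomial` and `shifted_multiset` are hypothetical, a partition is given as a tuple of positive parts, and the board has cells $(i, j)$ with $j$ less than the $i$th part.

```python
def rook_polynomial(mu):
    """Coefficient list [r_0, r_1, ...] of the rook polynomial of the Ferrers board of mu."""
    counts = [0] * (len(mu) + 1)

    def place(row, used_cols, rooks):
        if row == len(mu):
            counts[rooks] += 1
            return
        place(row + 1, used_cols, rooks)              # leave this row empty
        for col in range(mu[row]):
            if col not in used_cols:                  # rook at (row, col) attacks no earlier rook
                place(row + 1, used_cols | {col}, rooks + 1)

    place(0, frozenset(), 0)
    while counts[-1] == 0:                            # drop trailing zero coefficients
        counts.pop()
    return counts


def shifted_multiset(mu, n):
    """Sorted list of mu_i + i for 1 <= i <= n, after padding mu with zeros to n parts."""
    padded = list(mu) + [0] * (n - len(mu))
    return sorted(part + i for i, part in enumerate(padded, start=1))


partitions_of_4 = [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
for mu in partitions_of_4:
    print(mu, rook_polynomial(mu), shifted_multiset(mu, 4))
```

For the five partitions of 4, two partitions have the same rook polynomial exactly when their sorted lists agree.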


Parking Functions. A function $f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ is a parking function iff
$|\{x : f(x) \le i\}| \ge i$ for all $i$ between 1 and $n$. There are $(n+1)^{n-1}$ parking functions of
order $n$. A bijection from parking functions to labeled Dyck paths is given by listing the
labels $\{x : f(x) = i\}$ in increasing order in column $i$ for $i = 1, 2, \ldots, n$ in turn, putting
one label in each row from bottom to top. A bijection from labeled Dyck paths to trees
is given by letting the children of $a_i$ be the labels in column $i+1$, for all $i \ge 0$ (where
$a_0 = 0$ and $a_1, \ldots, a_n$ are the labels from bottom to top).
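
The defining inequality translates directly into a brute-force count, which can be compared with $(n+1)^{n-1}$ for small $n$. This is an illustrative sketch; the function names are hypothetical, and a function $f$ is represented by the tuple $(f(1), \ldots, f(n))$.

```python
from itertools import product


def is_parking_function(f):
    """Check |{x : f(x) <= i}| >= i for every i from 1 to n, where f is a tuple of values."""
    n = len(f)
    return all(sum(1 for value in f if value <= i) >= i for i in range(1, n + 1))


def count_parking_functions(n):
    """Count parking functions of order n by testing all n^n functions."""
    return sum(is_parking_function(f) for f in product(range(1, n + 1), repeat=n))


for n in range(1, 6):
    print(n, count_parking_functions(n), (n + 1) ** (n - 1))
```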


Facts about Cyclic Groups. If G is a cyclic group of size n < oo, then G has a unique 
cyclic subgroup of size d for each divisor d of n, and these are all the subgroups of G. Any 
cyclic group of size $d$ has $\phi(d)$ generators, and hence $n = \sum_{d \mid n} \phi(d)$. If $G$ is a group of
size n with at most one subgroup of size d for each divisor d of n, then G must be cyclic. 
Hence, any finite subgroup of the multiplicative group of a field is cyclic. 


Counting Irreducible Polynomials. The size of a finite field must be a prime power. For each
prime power $q$, there exists a field $F$ with $q$ elements, which is unique up to isomorphism.
For such a field $F$, let $I(n,q)$ be the number of monic irreducible polynomials of degree $n$
in $F[x]$. Classifying elements in the field of size $q^n$ by their minimal polynomials in $F[x]$
gives $q^n = \sum_{d \mid n} d\,I(d,q)$. Hence, by Möbius inversion, $I(n,q) = \frac{1}{n}\sum_{d \mid n} q^d\mu(n/d)$, where
$\mu$ is the Möbius function from Definition 4.31.
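
Both formulas are simple to evaluate by machine. The sketch below is an illustration with hypothetical function names (`mobius`, `num_irreducible`); it computes $I(n,q)$ from the Möbius inversion formula and confirms that the values satisfy $q^n = \sum_{d \mid n} d\,I(d,q)$.

```python
def mobius(n):
    """Number-theoretic Mobius function, computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0           # n is divisible by the square of a prime
            result = -result
        p += 1
    if n > 1:                      # one remaining prime factor
        result = -result
    return result


def num_irreducible(n, q):
    """I(n, q) = (1/n) * sum over divisors d of n of q^d * mu(n/d)."""
    total = sum(q ** d * mobius(n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


print([num_irreducible(n, 2) for n in range(1, 9)])
# Check the identity q^n = sum of d * I(d, q) over divisors d of n.
print(all(q ** n == sum(d * num_irreducible(d, q) for d in range(1, n + 1) if n % d == 0)
          for q in (2, 3, 4, 5) for n in range(1, 13)))
```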


Subspaces of Vector Spaces over Finite Fields. A $d$-dimensional vector space over a
$q$-element field has size $q^d$. The number of $k$-dimensional subspaces of an $n$-dimensional
vector space over a $q$-element field is the integer $\begin{bmatrix} n \\ k \end{bmatrix}_q$. Each such subspace has a unique
basis in reduced row-echelon form (RREF). The number of $k \times n$ RREF matrices with
entries in a $q$-element field is thus $\begin{bmatrix} n \\ k \end{bmatrix}_q$.


Combinatorial Meaning of Tangent and Secant Power Series. $\tan x = \sum_{n \ge 0} (a_n/n!)x^n$,
where $a_n$ counts permutations $w$ satisfying $w_1 < w_2 > w_3 < w_4 > \cdots > w_n$; and $\sec x =
\sum_{n \ge 0} (b_n/n!)x^n$, where $b_n$ counts permutations $w$ satisfying $w_1 < w_2 > w_3 < \cdots < w_n$.


Properties of Determinants. The determinant of a matrix $A \in M_n(R)$ is a multilinear,
alternating function of the rows (or the columns) of $A$ such that $\det(I_n) = 1$. This means
that $\det(A)$ is a linear function of any given row when the other rows are fixed, and the
determinant is zero if $A$ has two equal rows; similarly for columns. We have $\det(A^{\mathrm{tr}}) =
\det(A)$. For triangular or diagonal $A$, $\det(A) = \prod_{i=1}^n A(i,i)$. The Laplace expansions for
$\det(A)$ along row $k$ and column $k$ are

\[ \det(A) = \sum_{j=1}^n (-1)^{k+j}A(k,j)\det(A[k|j]) = \sum_{i=1}^n (-1)^{i+k}A(i,k)\det(A[i|k]), \]

where $A[k|j]$ is $A$ with row $k$ and column $j$ deleted. We have $A(\mathrm{adj}\,A) = (\det(A))I_n =
(\mathrm{adj}\,A)A$, so that $A^{-1} = (\det(A))^{-1}\,\mathrm{adj}(A)$ when $\det(A)$ is invertible in $R$. Similar results
hold with R replaced by any commutative ring R. 


The Cauchy-Binet Theorem. Given an $m \times n$ matrix $A$ and an $n \times m$ matrix $B$ with
$m \le n$,

\[ \det(AB) = \sum_{1 \le j_1 < j_2 < \cdots < j_m \le n} \det(A^{j_1}, \ldots, A^{j_m})\det(B_{j_1}, \ldots, B_{j_m}), \]

where $A^j$ is the $j$th column of $A$, and $B_j$ is the $j$th row of $B$. In particular, $\det(AB) =
\det(A)\det(B)$ for all $n \times n$ matrices $A$ and $B$.
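
A small numerical instance may make the indexing in the Cauchy-Binet sum concrete. The sketch below is illustrative only: the sample matrices and the helper names `det`, `matmul`, and `cauchy_binet_sum` are hypothetical, and the determinant is computed by Laplace expansion along the first row, which is adequate only for small matrices.

```python
from itertools import combinations


def det(M):
    """Determinant of a square matrix (list of rows) by Laplace expansion along row 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def matmul(X, Y):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def cauchy_binet_sum(A, B):
    """Sum over column sets j_1 < ... < j_m of det(chosen columns of A) * det(chosen rows of B)."""
    m, n = len(A), len(B)
    total = 0
    for cols in combinations(range(n), m):
        A_cols = [[A[i][j] for j in cols] for i in range(m)]
        B_rows = [B[j] for j in cols]
        total += det(A_cols) * det(B_rows)
    return total


# Hypothetical 2 x 4 and 4 x 2 sample matrices.
A = [[2, 1, 0, 3], [1, -1, 1, 2]]
B = [[1, 0], [2, 1], [0, -1], [3, 2]]
print(det(matmul(A, B)), cauchy_binet_sum(A, B))
```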


Tournaments. A tournament is a digraph with no loop edges and exactly one directed
edge between each pair of distinct vertices. A tournament $\tau$ is transitive iff for all vertices
$u, v, w$, if $(u,v)$ and $(v,w)$ are edges of $\tau$, then $(u,w)$ is an edge of $\tau$. Moreover, $\tau$ is
transitive iff $\tau$ has no directed 3-cycle iff the list of outdegrees of the vertices of $\tau$ has no
repetitions. A sign-reversing involution exists that cancels all non-transitive tournaments,
leading to the following formula for the Vandermonde determinant:

\[ \det[x_j^{n-i}]_{1 \le i,j \le n} = \sum_{w \in S_n} \mathrm{sgn}(w)\prod_{k=1}^n x_{w(k)}^{n-k} = \prod_{1 \le i < j \le n} (x_i - x_j). \]


The Hook-Length Formula. For a partition $\lambda$ with $n$ boxes, the number of standard
tableaux of shape $\lambda$ is $n!/\prod_{c \in \mathrm{dg}(\lambda)} h(c)$, where $h(c)$ is the hook length of cell $c$. This can be
proved probabilistically by defining a random algorithm that generates each $S \in \mathrm{SYT}(\lambda)$
with probability $\prod_{c \in \mathrm{dg}(\lambda)} h(c)/n!$. To build $S$, start at a random cell in $\mathrm{dg}(\lambda)$, then
repeatedly jump to a random new cell in the hook of the current cell until reaching a corner.
Place $n$ in this corner and proceed recursively to fill the other cells in $\mathrm{dg}(\lambda)$.
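
The formula can be cross-checked against the corner-removal recursion for the number of standard tableaux (the recursion of Exercise 12-75). This is a sketch with hypothetical function names; a partition is assumed to be a tuple of positive, weakly decreasing parts.

```python
from functools import lru_cache
from math import factorial


def hook_length_count(shape):
    """Number of standard tableaux of the given shape via the Hook-Length Formula."""
    n = sum(shape)
    # conj[j] = number of rows of length greater than j (the conjugate partition).
    conj = [sum(1 for part in shape if part > j) for j in range(shape[0])] if shape else []
    product = 1
    for i, part in enumerate(shape):
        for j in range(part):
            arm = part - j - 1        # cells to the right in the same row
            leg = conj[j] - i - 1     # cells below in the same column
            product *= arm + leg + 1  # hook length of cell (i, j)
    return factorial(n) // product


@lru_cache(maxsize=None)
def count_by_corner_removal(shape):
    """Same count, by summing over all ways to remove one corner cell."""
    if not shape:
        return 1
    total = 0
    for i, part in enumerate(shape):
        if i + 1 == len(shape) or shape[i + 1] < part:   # row i ends in a corner
            smaller = shape[:i] + ((part - 1,) if part > 1 else ()) + shape[i + 1:]
            total += count_by_corner_removal(smaller)
    return total


for shape in [(2, 1), (3, 2, 1), (4, 3, 1), (4, 4, 4)]:
    print(shape, hook_length_count(shape), count_by_corner_removal(shape))
```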


Knuth Equivalence and Monotone Subsequences of Words. Two words $v$ and $w$ are Knuth
equivalent iff $v$ can be changed into $w$ by a sequence of moves of the form $\cdots yxz \cdots \leftrightarrow
\cdots yzx \cdots$ (where $x < y \le z$) or $\cdots xzy \cdots \leftrightarrow \cdots zxy \cdots$ (where $x \le y < z$). These
moves simulate tableau insertion (when applied to reading words), so every $w$ is Knuth
equivalent to the reading word of its insertion tableau $P(w)$. Words $v$ and $w$ are Knuth
equivalent iff $P(v) = P(w)$. If $P(w)$ has shape $\lambda$, then $\lambda_1 + \cdots + \lambda_k$ is the maximum total
length of a set of $k$ disjoint weakly increasing subsequences of $w$, and $\lambda'_1 + \cdots + \lambda'_k$ is the
maximum total length of a set of $k$ disjoint strictly decreasing subsequences of $w$.


Quasisymmetric Polynomials. A polynomial $f \in \mathbb{R}[x_1, \ldots, x_N]$ is quasisymmetric iff the
coefficients of $x^\alpha$ and $x^\beta$ in $f$ are equal whenever the exponent sequences $\alpha$ and $\beta$ are shifts
of each other. For a composition $\alpha$ with at most $N$ parts, the monomial quasisymmetric
polynomial $M_\alpha$ is the sum of all monomials $x^\beta$ where $\beta$ is a shift of $\alpha$. The set $\{M_\alpha :
\alpha \in \mathrm{Comp}_N(k)\}$ is a basis of the vector space $Q_N^k$ of quasisymmetric polynomials that are
homogeneous of degree $k$. For $S \subseteq [k-1]$, the fundamental quasisymmetric polynomial
$FQ_{k,S}$ is the sum of all products $x_{i_1}x_{i_2}\cdots x_{i_k}$ where $i_1, \ldots, i_k$ is a weakly increasing
sequence such that $j \in S$ implies $i_j < i_{j+1}$. For $N \ge k$, $\{FQ_{k,S} : S \subseteq [k-1]\}$ is a basis
of $Q_N^k$. We have $FQ_{k,S} = \sum_{\alpha :\, S \subseteq \mathrm{sub}(\alpha)} M_\alpha$ and $s_\lambda = \sum_{U \in \mathrm{SYT}(\lambda)} FQ_{|\lambda|,\mathrm{Des}(U)}$. The proof
of the last formula uses a standardization bijection to convert semistandard tableaux
to standard tableaux. This bijection relabels equal entries in a tableau $T$ by a run of
consecutive integers.


Pfaffians. Let $N$ be even. Given an $N \times N$ skew-symmetric matrix $A$ (meaning $A^{\mathrm{tr}} = -A$),
the Pfaffian of $A$ is

\[ \mathrm{Pf}(A) = \sum_{w \in \mathrm{SPf}_N} \mathrm{sgn}(w)\prod_{i \text{ odd}} A(w_i, w_{i+1}), \]

where $w \in \mathrm{SPf}_N$ iff $w \in S_N$, $w_i < w_{i+1}$, and $w_i < w_{i+2}$ for all odd $i$. We have $\det(A) =
\mathrm{Pf}(A)^2$. Each term of $\mathrm{Pf}(A)$ counts a signed, weighted perfect matching of a graph with
vertex set $\{1, 2, \ldots, N\}$, where an edge from $i$ to $j$ (for $i < j$) is weighted by $A(i,j)$.
There is a recursion $\mathrm{Pf}(A) = \sum_{j=2}^N (-1)^jA(1,j)\,\mathrm{Pf}(A[[1,j]])$, where $A[[1,j]]$ is the matrix
obtained by deleting rows 1 and $j$ and columns 1 and $j$ of $A$.
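
The recursion gives a direct (exponential-time) way to compute Pfaffians of small matrices, and the identity $\det(A) = \mathrm{Pf}(A)^2$ provides a check. The sketch below is illustrative: the helper names and the sample matrices are hypothetical, indices in the code are 0-based, and the determinant uses Laplace expansion along the first row.

```python
def pfaffian(A):
    """Pfaffian of a skew-symmetric matrix of even size, by expansion along row 1."""
    n = len(A)
    if n == 0:
        return 1
    total = 0
    for j in range(1, n):                      # 0-based j is column j + 1 in the text
        keep = [t for t in range(n) if t not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * A[0][j] * pfaffian(minor)
    return total


def det(M):
    """Determinant by Laplace expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def skew_from_upper(entries, n):
    """Build the n x n skew-symmetric matrix whose entries above the diagonal are given row by row."""
    A = [[0] * n for _ in range(n)]
    values = iter(entries)
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = next(values)
            A[j][i] = -A[i][j]
    return A


A4 = skew_from_upper([1, 2, 3, 4, 5, 6], 4)
A6 = skew_from_upper([2, -1, 0, 3, 1, 4, 1, -2, 0, 5, 2, 1, -3, 1, 2], 6)
print(pfaffian(A4), det(A4))
print(pfaffian(A6) ** 2 == det(A6))
```

For the $4 \times 4$ sample, the expansion reduces to $A(1,2)A(3,4) - A(1,3)A(2,4) + A(1,4)A(2,3) = 6 - 10 + 12 = 8$.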


Domino Tilings. For all $m, n \in \mathbb{Z}_{>0}$ with $m$ even, the coefficient of $x^ay^b$ in

\[ 2^{mn/2}\prod_{j=1}^{m/2}\prod_{k=1}^{n}\sqrt{x^2\cos^2\left(\frac{\pi j}{m+1}\right) + y^2\cos^2\left(\frac{\pi k}{n+1}\right)} \]

is the number of ways to tile an $m \times n$ board with $a$ horizontal dominos and $b$ vertical
dominos. The steps in the proof are: (a) model domino tilings by perfect matchings of
a grid-shaped graph; (b) use a Pfaffian to enumerate these signed perfect matchings;
(c) adjust signs in the matrix so every perfect matching has sign $+1$; (d) rewrite the
Pfaffian as the square root of the determinant of the matrix; (e) evaluate the determinant
by performing a similarity transformation that nearly diagonalizes the matrix, creating
$2 \times 2$ blocks running down the diagonal. Each $2 \times 2$ block contributes one of the factors
in the product formula above.
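
Setting $x = y = 1$ in the product gives the total number of domino tilings, which can be compared with a direct search on small boards. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the product is evaluated in floating point and rounded to the nearest integer, and the brute-force counter always covers the first empty cell in row-major order.

```python
from math import cos, pi, sqrt


def tilings_by_formula(m, n):
    """Total number of domino tilings of an m x n board (m even), from the product formula."""
    assert m % 2 == 0, "the formula as stated requires m to be even"
    value = 2.0 ** (m * n // 2)
    for j in range(1, m // 2 + 1):
        for k in range(1, n + 1):
            value *= sqrt(cos(pi * j / (m + 1)) ** 2 + cos(pi * k / (n + 1)) ** 2)
    return round(value)


def tilings_by_search(m, n):
    """Count tilings by backtracking over cells in row-major order."""
    filled = [[False] * n for _ in range(m)]

    def place(pos):
        while pos < m * n and filled[pos // n][pos % n]:
            pos += 1
        if pos == m * n:
            return 1
        r, c = divmod(pos, n)
        total = 0
        if c + 1 < n and not filled[r][c + 1]:        # horizontal domino
            filled[r][c] = filled[r][c + 1] = True
            total += place(pos + 1)
            filled[r][c] = filled[r][c + 1] = False
        if r + 1 < m and not filled[r + 1][c]:        # vertical domino
            filled[r][c] = filled[r + 1][c] = True
            total += place(pos + 1)
            filled[r][c] = filled[r + 1][c] = False
        return total

    return place(0)


for m, n in [(2, 2), (2, 3), (4, 3), (4, 4)]:
    print((m, n), tilings_by_formula(m, n), tilings_by_search(m, n))
```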


Exercises 


12-1. Let ~ be the cyclic shift relation from §12.1. Find all the equivalence classes of ~ for: 
(a) the set of lattice paths ending at (3,4); (b) the set of lattice paths ending at (3,3). 
12-2. In this problem, we do not assume $\gcd(r,s) = 1$. For $v, w \in R(N^sE^r)$, write $v \sim w$
iff w can be obtained from v by a cyclic shift. Which of the following statements must be 
true for all r,s > 1? Explain. (a) Every equivalence class of ~ has size r + s. (b) Every 
equivalence class of ~ contains at least one path staying weakly above sx = ry. (c) Every 
equivalence class of ~ contains at most one path staying weakly above sx = ry. 

12-3. Fix $h, k \in \mathbb{Z}_{\ge 0}$ and $m \in \mathbb{Z}_{>0}$. Show that the number of lattice paths from $(0,0)$ to
$(k + mh, h)$ that never go below the line $x = k + my$ is

\[ \binom{k+(m+1)h}{k+mh,\; h} - m\binom{k+(m+1)h}{k+mh+1,\; h-1} \]

via a bijective proof analogous to the proof of the Dyck Path Rule 1.101 in §1.14. (Hint:
Label each point $(x,y)$ by the integer $x - k - my$ and thereby divide the bad paths into $m$
classes. Reflections do not work for $m > 1$, so look for other symmetries.)

12-4. Verify the Chung-Feller theorem directly for $n = 3$ by drawing all lattice paths from
$(0,0)$ to $(3,3)$ with: (a) 0 flaws; (b) 1 flaw; (c) 2 flaws; (d) 3 flaws.

12-5. Let $\pi$ be the Dyck path NNENEENNNENNENNEEENENEEE. Use the bijections
from the proof of Theorem 12.4 to compute the associated lattice path with: (a) 5 flaws;
(b) 8 flaws; (c) 10 flaws.

12-6. For each flawed path $\pi$, find the Dyck path associated to $\pi$ via the bijections used to
prove Theorem 12.4.
(a) NENNEEEENENNNEENEENENNNE
(b) NEEENNENEEENNNNNEENE

12-7. Let $\pi$ be a random lattice path from $(0,0)$ to $(n,n)$, and for $1 \le j \le n$, let $X_j(\pi)$ be 1 if
$\pi$ has a flaw in row $j$ and 0 otherwise. Prove bijectively that $P(X_j = 0) = 1/2 = P(X_j = 1)$.

12-8. Let $X_1, X_2, \ldots, X_n$ be independent random variables such that $P(X_i = 1) = 1/2 =
P(X_i = 0)$ for all $i$. (This means that, for all $v_1, \ldots, v_n \in \{0,1\}$, the events $X_1 = v_1$, $X_2 =
v_2$, ..., $X_n = v_n$ are independent as in Definition 1.66.) Compute $P(X_1 + X_2 + \cdots + X_n = k)$
for $k$ between 0 and $n$. Contrast your answer with the Chung-Feller theorem.

12-9. Find a formula for the number of lattice paths from $(0,0)$ to $(n,n)$ with $k$ flaws and
$j$ east steps departing from the line $y = x$.


12-10. Compute the rook polynomial for each of the following partitions. (a) (3, 2,1) 
(b) (8,8, 8,8, 8,8, 8,8) (c) (n) (d) (n,n, 1*). 

12-11. Draw the diagrams of all integer partitions of 8 and determine which pairs of par- 
titions are rook-equivalent. 


12-12. Prove that for any integer partition $\mu$, $R_\mu(x) = R_{\mu'}(x)$.

12-13. (a) For any $n \ge 1$, prove that the partition $\mu$ consisting of $n$ copies of $n$ is
rook-equivalent to the partition $\nu = (2n-1, 2n-3, \ldots, 5, 3, 1)$. (b) Define a bijection between
the set of non-attacking placements of $k$ rooks on $\mu$ and the set of non-attacking placements
of $k$ rooks on $\nu$.

12-14. Let $\mu$ be an integer partition such that $\mathrm{dg}(\mu) \subseteq \mathrm{dg}(\Delta_N)$, where $\Delta_N = (N-1, N-2,
\ldots, 3, 2, 1, 0)$. Let the sequence $(N-1-\mu_1, N-2-\mu_2, \ldots, 0-\mu_N)$ contain $a_k$ copies
of $k$ for each $k \ge 0$. (This sequence gives the row lengths of the skew shape $\Delta_N/\mu$.) Prove
that the number of partitions that are rook-equivalent to $\mu$ is

\[ \prod_{k \ge 1} \binom{a_{k-1} + a_k - 1}{a_{k-1} - 1,\; a_k}. \]


12-15. Let $\Delta_N = (N-1, N-2, \ldots, 3, 2, 1, 0)$. (a) Show that for all rook-equivalent partitions
$\mu$ and $\nu$, $\mathrm{dg}(\mu) \subseteq \mathrm{dg}(\Delta_N)$ iff $\mathrm{dg}(\nu) \subseteq \mathrm{dg}(\Delta_N)$. (b) Part (a) says that the set $\{\mu : \mathrm{dg}(\mu) \subseteq
\mathrm{dg}(\Delta_N)\}$ is a union of equivalence classes of the rook-equivalence relation. Determine the
number of equivalence classes in this union.


12-16. Show that for each integer partition $\mu$, there is a unique integer partition $\nu$ with
distinct parts that is rook-equivalent to $\mu$.


12-17. Given a non-attacking rook placement $\pi$ on a Ferrers board $F_\mu$, we define a $q$-weight
$\mathrm{wt}(\pi)$ as follows. Each rook in $\pi$ cancels its own square, all squares above the rook in its
column, and all squares left of the rook in its row. Let $\mathrm{wt}(\pi)$ be the number of uncanceled
squares in $F_\mu$. Define $r_k(\mu; q) = \sum_\pi q^{\mathrm{wt}(\pi)}$, where we sum over all non-attacking placements
of $k$ rooks on $F_\mu$. Find and prove a $q$-analogue of the factorization formula (12.1) where
$r_k(\mu)$ is replaced by $r_k(\mu; q)$.

12-18. Suppose $\mu$ is an integer partition with $\mathrm{dg}(\mu) \subseteq \mathrm{dg}(\Delta_N)$, where $\Delta_N = (N-1, N-2,
\ldots, 2, 1, 0)$. (a) Use an involution to prove

\[ r_k(\mu) = \sum_{i=0}^{k} S(N-i, N-k)(-1)^i e_i(N-1-\mu_1, N-2-\mu_2, \ldots, N-N-\mu_N), \]

where $S(u,v)$ is a Stirling number of the second kind and $e_i$ is an elementary symmetric
polynomial. (b) Use (a) to deduce a combinatorial proof of Theorem 2.64(d). (c) Use (a)
to deduce that the multiset condition in Theorem 12.10 is sufficient for $R_\mu(x) = R_\nu(x)$.
(d) Assume $\mu$ and $\nu$ are rook-equivalent partitions. Use (a) and the Garsia-Milne Involution
Principle (Exercise 4-88) to construct a bijection from the set of non-attacking placements
of $k$ rooks on $F_\mu$ to the set of non-attacking placements of $k$ rooks on $F_\nu$.

12-19. For each labeled Dyck path in Figure 12.8, compute the associated parking function 
and tree (see Theorems 12.21 and 12.22). 

12-20. (a) Convert the parking function f in Example 12.13 to a labeled Dyck path 
and a tree. (b) Convert the labeled Dyck path NNENNEENEENENNEE with labels 
5,8,2,4,1,6,3,7 (from bottom to top) to a parking function and a tree. (c) Convert the 
tree 


T = ({0,1,...10}, {{0, 9}, {5, 7}, {5, 8}, {9, 4}, {7, 6}, {6, 9}, {7, 10}, {10, 1}, {3,9}, {2, 9}}) 


to a labeled Dyck path and a parking function. 


12-21. Suppose we represent a function $f : \{1, 2, \ldots, b\} \to \{1, 2, \ldots, a+1\}$ as a labeled
lattice path ending at $(a,b)$. Find conditions on the labeled path that are equivalent to $f$
being (a) surjective; (b) injective.


12-22. (a) Given nonnegative integers $c_1, \ldots, c_{a+1}$ adding to $b$, how many labeled lattice
paths from $(0,0)$ to $(a,b)$ have $c_i$ labels in column $i$ for all $i$? (b) Use the bijections in §12.5
to translate (a) into enumeration results for parking functions and trees.


12-23. (a) Let $p_n$ be the number of parking functions of order $n$. Give a combinatorial proof
of the recursion

\[ p_n = \sum_{m=1}^{n} m\binom{n-1}{m-1}p_{m-1}p_{n-m}. \]

(b) Use (a) and Exercise 3-74 to define a bijection between parking functions and trees.


12-24. Let $P_n$ be the set of parking functions of order $n$. For $f \in P_n$, define $\mathrm{wt}(f) =
n(n+1)/2 - \sum_{i=1}^n f(i)$. Let $P_n(q) = \sum_{f \in P_n} q^{\mathrm{wt}(f)}$. Prove the recursion


12-25. Let $S$ be a $k$-element subset of $\{1, 2, \ldots, n\}$. Prove that there are $kn^{n-k-1}$ parking
functions $f$ such that $S = \{x : f(x) = 1\}$.


12-26. For each $n, k, m \in \mathbb{Z}_{\ge 0}$, let $P_{n,k,m}$ be the set of labeled lattice paths ending at
$(k + mn, n)$ that never go below the line $x = k + my$. Find a recursion satisfied by the
quantities $|P_{n,k,m}|$.

12-27. Find a bijection between the set of parking functions of order $n$ and the quotient
group $(\mathbb{Z}_{n+1})^n/H$, where $H$ is the subgroup generated by $(1, 1, \ldots, 1)$.

12-28. How many generators does an infinite cyclic group have?
12-29. Prove or disprove: if every proper subgroup of a finite group $G$ is cyclic, then $G$
itself must be cyclic.


12-30. Suppose $G$ is a group such that, for all $d \in \mathbb{Z}_{\ge 1}$, $G$ has at most $d$ elements $x$ such
that $x^d = 1$. Prove that every finite subgroup of $G$ is cyclic.


12-31. Describe all the finite subgroups of the field C. 


12-32. Quaternions. Let $\mathbb{H}$ be a four-dimensional real vector space with basis $1, i, j, k$.
Define multiplication on $\mathbb{H}$ by letting 1 act as the identity, setting $i^2 = j^2 = k^2 = -1$,
$ij = k = -ji$, $jk = i = -kj$, $ki = j = -ik$, and extending by linearity. (a) Show that $\mathbb{H}$
with this multiplication is a division ring (i.e., $\mathbb{H}$ satisfies all the axioms in the definition
of a field except commutativity of multiplication). (b) Find a non-cyclic finite subgroup of
the multiplicative group $\mathbb{H} - \{0\}$ (cf. Theorem 12.29). (c) Show that the equation $x^2 = -1$
has infinitely many solutions in $\mathbb{H}$.


12-33. Prove that the product of all the nonzero elements in a finite field $F$ is $-1_F$. Deduce
Wilson's Theorem: for $p$ prime, $(p-1)! \equiv -1 \pmod p$.

12-34. Compute the number of monic irreducible polynomials of degree 12 over a 9-element 
field. 


12-35. (a) Enumerate all the irreducible polynomials in $\mathbb{Z}_2[x]$ of degree at most 5. (b) Use
the formula in Theorem 12.30 to compute $I(n,2)$ for $1 \le n \le 8$ (compare with the results
in (a) for $1 \le n \le 5$).

12-36. Construction of Finite Fields. Let $F$ be a field with $q$ elements, let $h \in F[x]$
be a fixed monic irreducible polynomial of degree $n$, and let

\[ K = \{f \in F[x] : f = 0 \text{ or } \deg(f) < n\}. \]

For $f, g \in K$, define $f + g$ to be the sum of these polynomials in $F[x]$, and define $f \otimes g$ to
be the remainder when $fg$ is divided by $h$. Show that $K$, with these operations, is a field
of size $q^n$. The field $K$ is denoted $F[x]/(h)$.


12-37. Let $K = \mathbb{Z}_2[x]/(x^3 + x + 1)$ (see the previous exercise). Construct addition and
multiplication tables for $K$. Explicitly confirm that the multiplicative group $K - \{0\}$ is
generated by $x$ by computing $x^i$ for $1 \le i \le 7$.

12-38. Let $h = x^4 + x + 1 \in \mathbb{Z}_2[x]$, and let $K = \mathbb{Z}_2[x]/(h)$, which is a 16-element field.
(a) Explain why every element $y \in K$ satisfies $y^{16} = y$. (b) List all the elements of $K$
and their minimal polynomials over $\mathbb{Z}_2$. (c) Factor the polynomial $x^{16} - x \in \mathbb{Z}_2[x]$ into
a product of irreducible polynomials. (d) Explain the relation between part (b), part (c),
and the formulas in Theorem 12.30. (e) Find all generators of the cyclic group of nonzero
elements of $K$.

12-39. (a) Use Theorem 12.30 to show that $I(n,q) > 0$ for all prime powers $q$ and all $n \ge 1$.
(b) Prove that for every prime power $p^n$, there exists a field of size $p^n$.

12-40. Let $F$ be a finite field of size $q$. A polynomial $h \in F[x]$ is called primitive iff $h$ is a
monic irreducible polynomial such that $x$ is a generator of the multiplicative group of the
field $K = F[x]/(h)$. (a) Count the primitive polynomials of degree $n$ in $F[x]$. (b) Give an
example of an irreducible polynomial in $\mathbb{Z}_2[x]$ that is not primitive.

12-41. Let $K$ be a $q$-element field. How many $n \times n$ matrices with entries in $K$ are:
(a) upper-triangular; (b) strictly upper-triangular (zeroes on the main diagonal); (c) unitriangular
(ones on the main diagonal); (d) upper-triangular and invertible?

12-42. How many $2 \times 2$ matrices with entries in a $q$-element field have determinant 1?
12-43. Count the number of invertible $n \times n$ matrices with entries in a $q$-element field $F$.
How is the answer related to $[n]!_q$?
12-44. How many three-dimensional subspaces does the vector space Z? have?


12-45. For each integer partition $\mu$ that fits in a box with two rows and three columns,
draw a picture of the RREF matrix associated to $\mu$ in the proof of Theorem 12.37.

12-46. Find the RREF basis for the subspace of $\mathbb{Z}_5^5$ spanned by $v_1 = (1,4,2,3,4)$, $v_2 =
(2,3,1,0,0)$, and $v_3 = (0,0,3,1,1)$.

12-47. Find the RREF basis for the subspace of $\mathbb{Z}_2^6$ spanned by $v_1 = (0,1,1,1,1,0)$, $v_2 =
(1,1,1,0,1,1)$, and $v_3 = (1,0,1,0,0,0)$.

12-48. Let $V$ be an $n$-dimensional vector space over a field $K$. A flag of subspaces of
$V$ is a chain of subspaces $V = V_0 \supseteq V_1 \supseteq V_2 \supseteq \cdots \supseteq V_s = \{0\}$. Suppose $|K| = q$.
Given $n_1, \ldots, n_s$ and $n = n_1 + \cdots + n_s$, count the number of such flags in $V$ such that
$\dim_K(V_{i-1}) - \dim_K(V_i) = n_i$ for $i$ between 1 and $s$.


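12-49. (a) Give a linear-algebraic proof of the symmetry property $\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n \\ n-k \end{bmatrix}_q$ when $q$ is
a prime power. (b) Explain how the equality of formal polynomials $\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n \\ n-k \end{bmatrix}_q$ can be
deduced from (a).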


12-50. Let $V$ be an $n$-dimensional vector space over a $q$-element field, and let $X$ be the poset
of all subspaces of $V$, ordered by inclusion. Show that the Möbius function of $X$ is given
by $\mu_X(W,Y) = (-1)^dq^{d(d-1)/2}$ if $W \subseteq Y$ and $d = \dim(Y) - \dim(W)$, and $\mu_X(W,Y) = 0$ if
$W \not\subseteq Y$. (Use Exercise 8-12.)

12-51. Use the recursions for $a_n$ and $b_n$ in §12.8 to verify the values in (12.3).


12-52. Give probabilistic interpretations for the rational numbers appearing as coefficients
in the Maclaurin series for $\tan x$ and $\sec x$.


12-53. Fill in the details of Step 5 of the proof in §12.8. 

12-54. (a) List the permutations satisfying (12.4) for n = 1,3,5. (b) List the permutations 
satisfying (12.5) for n = 0, 2,4. 

12-55. (a) Develop ranking and unranking algorithms for up-down permutations. (b) Unrank
147 to get an up-down permutation in $S_7$. (c) Find the rank of 25364817 among
up-down permutations of length 8.


12-56. (a) Develop successor algorithms for up-down permutations. (b) Find the successor
of 25364817 among up-down permutations of length 8. (c) Find the successor of 385927164
among up-down permutations of length 9.

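12-57. Let $(q;q)_0 = 1$ and $(q;q)_n = (1-q)(1-q^2)\cdots(1-q^n)$ for $n \in \mathbb{Z}_{>0}$. Consider the
following formal $q$-analogues of trigonometric functions:

\[ \sin_q x = \sum_{k=0}^{\infty} (-1)^k\frac{x^{2k+1}}{(q;q)_{2k+1}}; \qquad \cos_q x = \sum_{k=0}^{\infty} (-1)^k\frac{x^{2k}}{(q;q)_{2k}}; \]

\[ \tan_q x = \sin_q x/\cos_q x; \qquad \sec_q x = 1/\cos_q x. \]

Define $q$-tangent numbers $t_n$ and $q$-secant numbers $s_n$ by

\[ \tan_q x = \sum_{n=0}^{\infty} \frac{t_n}{(q;q)_n}x^n; \qquad \sec_q x = \sum_{n=0}^{\infty} \frac{s_n}{(q;q)_n}x^n. \]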

(a) Show that for each $n \in \mathbb{Z}_{\ge 0}$,

\[ t_n = \sum_{w \text{ satisfying (12.4)}} q^{\mathrm{inv}(w)}, \qquad s_n = \sum_{w \text{ satisfying (12.5)}} q^{\mathrm{inv}(w)}. \]

(b) Use (a) to conclude that $t_n$ and $s_n$ are polynomials in $q$ with nonnegative integer
coefficients. Compute $t_n$ for $n = 1, 3, 5$ and $s_n$ for $n = 0, 2, 4$.

12-58. Suppose an $n \times n$ matrix $A$ is given in block form as $A = \begin{bmatrix} B & 0 \\ C & D \end{bmatrix}$, where $B$ is
$k \times k$, $C$ is $(n-k) \times k$, $D$ is $(n-k) \times (n-k)$, and 0 denotes a $k \times (n-k)$ block of zeroes.
Prove that $\det(A) = \det(B)\det(D)$.


12-59. Algorithmic Complexity of Determinant Evaluation. Let $A \in M_n(R)$.
(a) How many additions and multiplications in $R$ are needed to compute $\det(A)$ directly
from Definition 12.40? (b) How many additions and multiplications in $R$ are needed to
compute $\det(A)$ recursively, using Theorem 12.51? (c) Explain how to use Theorems 12.44
and 12.50 to compute $\det(A)$ efficiently (using about $cn^3$ operations in $R$ for some constant
$c$).

12-60. Permanents. The permanent of an $n \times n$ matrix $A \in M_n(R)$ is $\mathrm{per}(A) =
\sum_{w \in S_n}\prod_{i=1}^n A(i, w(i))$. Prove the following facts about permanents: (a) $\mathrm{per}(A^{\mathrm{tr}}) = \mathrm{per}(A)$;
(b) if $A$ is diagonal, then $\mathrm{per}(A) = \prod_{i=1}^n A(i,i)$; (c) $\mathrm{per}(I_n) = 1$; (d) $\mathrm{per}(A)$ is an
$R$-multilinear function of the rows and columns of $A$ (cf. Theorem 12.48); (e) if $B$ is obtained
from $A$ by permuting the rows in any fashion, then $\mathrm{per}(B) = \mathrm{per}(A)$.
12-61. State and prove versions of the expansions in Theorem 12.51 for computing perma- 
nents. 


12-62. Verify the characterization of linear maps stated in Example 12.47. 
12-63. Complete the proof of Theorem 12.53 by showing that $(\mathrm{adj}\,A)A = \det(A)I_n$.


12-64. Cramer's Rule. Let $A \in M_n(R)$ where $\det(A)$ is invertible in $R$, let $b$ be a given
$n \times 1$ vector, and let $x = [x_1 \cdots x_n]^{\mathrm{tr}}$. Show that the unique solution of the linear system
$Ax = b$ is given by $x_i = \det(A_i)/\det(A)$, where $A_i$ is the matrix obtained from $A$ by
replacing the $i$th column by $b$.


12-65. Verify the Cauchy—Binet Theorem by hand computation for the matrices 


2 10 8 ea 

A-|]1 -1 1 2 B= 
4 021] a 
—2 0 -1 


12-66. Consider a function $w : \{1, 2, \ldots, k\} \to \{1, 2, \ldots, n\}$, which we regard as a word
$w = w_1w_2\cdots w_k$. Show that there exist basic transpositions $t_1, \ldots, t_m \in S_k$ such that
$w \circ (t_1t_2\cdots t_m)$ is a weakly increasing word, and the minimum possible value of $m$ is $\mathrm{inv}(w)$.
12-67. Let $A$ and $B$ be $n \times n$ matrices. Prove that $\det(AB) = \det(A)\det(B)$ by imitating
(and simplifying) the proof of the Cauchy-Binet Theorem 12.56.

12-68. Given an m x n matrix A and an n x m matrix B with m > n, what is det(AB)? 
Explain. 

12-69. Let $\tau$ be the tournament with edge set $\{(2,1), (1,3), (4,1), (1,5), (6,1), (2,3), (4,2),
(5,2), (2,6), (3,4), (3,5), (6,3), (4,5), (6,4), (6,5)\}$. Compute $\mathrm{wt}(\tau)$, $\mathrm{inv}(\tau)$, and $\mathrm{sgn}(\tau)$. Is $\tau$
transitive?

12-70. Let $\tau$ be the tournament in Example 12.60, and let $I$ be the involution used to prove
Theorem 12.65. Compute $\tau' = I(\tau)$, and verify directly that $\mathrm{wt}(\tau') = \mathrm{wt}(\tau)$, $\mathrm{sgn}(\tau') =
-\mathrm{sgn}(\tau)$, and $I(\tau') = \tau$.

12-71. Use induction and Theorem 12.50 to give an algebraic proof of Theorem 12.65.
12-72. Suppose $x_0, x_1, \ldots, x_N$ are distinct elements of a field $F$. State why the Vandermonde
matrix $[x_i^j]_{0 \le i,j \le N}$ is invertible. Use this to prove that if $p \in F[x]$ has degree at most $N$
and satisfies $p(x_i) = 0$ for $0 \le i \le N$, then $p$ must be the zero polynomial.

12-73. A king in a tournament $\tau$ is a vertex $v$ from which every other vertex can be reached
by following at most two directed edges. Show that every vertex of maximum outdegree in
a tournament is a king; in particular, every tournament has a king.

12-74. Use the Hook-Length Formula to compute $f^\lambda$ for the following shapes $\lambda$: (a) $(3,2,1)$;
(b) $(4,4,4)$; (c) $(6,3,2,2,1,1,1)$; (d) $(n, n-1)$; (e) (a,1).

12-75. Show that $f^{(0)} = 1$ and, for all nonzero partitions $\lambda$, $f^\lambda = \sum_\mu f^\mu$, where we sum
over all $\mu$ that can be obtained from $\lambda$ by removing some corner square. Use this recursion
to calculate $f^\lambda$ for all $\lambda$ with $|\lambda| \le 6$.

12-76. (a) Develop ranking and unranking algorithms for standard tableaux of shape $\lambda$
based on the recursion in Exercise 12-75. (b) Unrank 46 to get a standard tableau of shape
$(4,3,1)$. (c) Rank the standard tableau
[2|5[8} 
6] 7 


12-77. (a) Develop successor algorithms for standard tableaux of shape $\lambda$ based on the
recursion in Exercise 12-75. (b) Find the successors of the three standard tableaux of shape
$(4,2,2,1)$ shown in the Introduction of this book. (c) Find the successor of the standard
tableau in part (c) of the previous exercise.

12-78. Enumerate all the hook walks for the shape $\lambda = (4,3,2,1)$ that end in the corner
cell $(2,3)$, and compute the probability of each walk. Use this computation to verify
Theorem 12.73 in this case.

12-79. Suppose $\lambda \in \mathrm{Par}(p)$ where $p$ is prime. (a) Show that $p$ divides $f^\lambda$ if $\lambda$ is not a hook
shape (a hook shape is a partition of the form $(a, 1^{p-a})$). (b) Compute $f^\lambda \bmod p$ if $\lambda$ is a
hook shape.

12-80. Does the Hook-Length Formula extend to enumerate standard tableaux of skew 
shape? Either adapt the probabilistic proof to this situation, or find the steps in the proof 
that cannot be generalized. 

12-81. Confirm that $\equiv_P$ and $\equiv_K$ are equivalence relations on $[N]^*$, as asserted in §12.13.
12-82. Let $T$ be the tableau in Example 12.81. Find an explicit chain of elementary Knuth
equivalences demonstrating that $\mathrm{rw}(T)1 \equiv_K \mathrm{rw}(T \leftarrow 1)$.


12-83. Find the length of the longest increasing and decreasing subsequences of the word 
w = 4135321462731132423142. 


12-84. Complete the proofs of Theorems 12.83 and 12.84. 


12-85. For any semistandard tableau T, prove that P(rw(T)) = T. Show that the set of 
reading words of semistandard tableaux intersects every Knuth equivalence class in exactly 
one point. 

12-86. Prove Theorem 12.89 without using the RSK algorithm. 

12-87. (a) Show that each Q*, is a real vector space. (b) Prove: for all f € Q&, and g € Q’, 
fg € Qyr™. Conclude that Quy is a subalgebra of R[z1,..., ay]. 

12-88. List all compositions of 4, all bit strings in $\{0,1\}^3$, and all subsets of [3]. Show how
these objects correspond under the bijections described in the text.

12-89. Assume $\alpha \in \mathrm{Comp}_N(k)$ has $s$ parts. (a) How many monomials appear in
$M_\alpha(x_1,\ldots,x_N)$? (b) How many monomials appear in $FQ_{N,\mathrm{sub}(\alpha)}(x_1,\ldots,x_N)$?

12-90. Prove Theorem 12.94.

12-91. Prove: for all $\lambda \in \mathrm{Par}_N(k)$,

$$m_\lambda(x_1,\ldots,x_N) = \sum_{\substack{\alpha \in \mathrm{Comp}_N(k):\\ \mathrm{sort}(\alpha)=\lambda}} M_\alpha(x_1,\ldots,x_N).$$

Conclude that $\Lambda_N^k \subseteq Q_N^k$.


12-92. Let $N = k = 4$. For all $S \subseteq \{1,2,3\}$, write $FQ_{4,S}$ as a sum of monomials in the
$x$-variables and as a linear combination of the $M_\alpha$.


12-93. For all partitions $\lambda \in \mathrm{Par}(5)$, write $s_\lambda$ as an explicit linear combination of
fundamental quasisymmetric polynomials.


12-94. For each semistandard tableau T of shape (4,3) and content (2,2,2,1), find the 
image of T under the standardization map F' from the proof of Theorem 12.99. 


12-95. Describe all semistandard tableaux that standardize to this standard tableau: 


1216) 
41819] 


12-96. Does Theorem 12.99 extend to skew Schur polynomials $s_{\lambda/\mu}$?


12-97. For $k \ge 1$, let $A_k$ be the matrix with rows and columns indexed by subsets of $[k-1]$,
such that the entry in row $S$ and column $T$ is 1 if $S \subseteq T$ and 0 otherwise. Order the rows
and columns of $A_k$ by identifying a subset $S$ with a bit string of length $k-1$ and regarding
this string as an integer written in binary. For example, when $k = 5$ and $S = \{1,2,4\}$,
the associated bit string is 1101, which is the integer 13. (a) Find the entries in $A_k$ for
$1 \le k \le 4$. (b) How is $A_{k+1}$ related to $A_k$? (Describe $A_{k+1}$ in block form.) (c) Find $A_k^{-1}$
for $1 \le k \le 4$. (d) How is $A_{k+1}^{-1}$ related to $A_k^{-1}$?

12-98. Assume $N \ge k$ and $\alpha \in \mathrm{Comp}(k)$. Find a formula expressing $M_\alpha$ as an explicit
linear combination of the basis $\{FQ_{N,S} : S \subseteq [k-1]\}$.


12-99. Suppose $Y$ is a set and $G : \mathbb{Z}_{\ge 0}^N \to Y$ is a surjective function. For each $y \in Y$, define
$f_y \in \mathbb{R}[x_1,\ldots,x_N]$ to be the sum of all monomials $x^\alpha$ such that $G(\alpha) = y$. (a) Prove that
$\{f_y : y \in Y\}$ is linearly independent. (b) Define $f \in \mathbb{R}[x_1,\ldots,x_N]$ to be G-symmetric iff
for all $\alpha, \beta \in \mathbb{Z}_{\ge 0}^N$ such that $G(\alpha) = G(\beta)$, the coefficients of $x^\alpha$ and $x^\beta$ in $f$ are the same.
Show that $\{f_y : y \in Y\}$ is a basis for the vector space of G-symmetric polynomials in $N$
variables. (c) Explain how Theorems 9.23 and 12.94 are special cases of this exercise.


12-100. Multiplication Rule for Fundamental Quasisymmetric Polynomials.
Given a word $w = w_1w_2\cdots w_n$ containing the symbols $1,2,\ldots,n$ once each, and given a
word $v = v_1v_2\cdots v_m$ containing the symbols $n+1,n+2,\ldots,n+m$ once each, let $\mathrm{Shuf}(w,v)$
be the set of words $u = u_1u_2\cdots u_{n+m}$ that contain $w$ and $v$ as subsequences; $u$ is called a
shuffle of $w$ and $v$. Show that

$$FQ_{N,\mathrm{Des}(w)}\, FQ_{N,\mathrm{Des}(v)} = \sum_{u \in \mathrm{Shuf}(w,v)} FQ_{N,\mathrm{Des}(u)}.$$


12-101. Show that if A is an N x N skew-symmetric matrix with N odd, then det(A) = 0. 


12-102. Verify by direct calculation that $\det(A) = (af + cd - be)^2$ for the $4 \times 4$ matrix $A$
in Example 12.101.


12-103. Find the Pfaffian of a general 6 x 6 skew-symmetric matrix. 
12-104. Count the number of perfect matchings for the graph shown in Figure 12.13.

12-105. Let $G$ be the simple graph with $V(G) = \{1,2,3,4,5,6\}$ and

$$E(G) = \{\{2,3\},\{3,4\},\{4,5\},\{2,5\},\{1,2\},\{1,5\},\{3,6\},\{4,6\},\{2,4\}\}.$$

Find all perfect matchings of $G$. Use this to compute $\sum_{M \in \mathrm{PM}(G)} \mathrm{sgn}(M)\,\mathrm{wt}(M)$, and verify
your answer by evaluating a Pfaffian.


12-106. Compute the images of the following permutations in $S_N^{\mathrm{ev}}$ under the bijection
$g : S_N^{\mathrm{ev}} \to \mathrm{SPf}_N^2$ used in the proof of Theorem 12.112: (a) (3,1,5,7)(2,4,8,6);
(b) (1,4)(2,3)(5,7)(6,8); (c) (2,5,1,6,8,4,7,3); (d) (3,2,1,5,6,7)(4,8).

12-107. Compute the images $w$ of the following pairs $(u,v) \in \mathrm{SPf}_N^2$ under the bijection
$g^{-1} : \mathrm{SPf}_N^2 \to S_N^{\mathrm{ev}}$ used in the proof of Theorem 12.112: (a) $u = 13254768$, $v = 15283647$;
(b) $u = 13254768$, $v = 12374856$; (c) $u = 15243867 = v$. In each case, confirm that the term
indexed by $w$ in $\det(A)$ equals the term indexed by $(u,v)$ in $\mathrm{Pf}(A)^2$.

12-108. Compute the exact number of domino tilings of a 10 x 10 board and a 6 x 9 board. 
12-109. How many domino tilings of an 8 x 8 board use: (a) 24 horizontal dominos and 8 
vertical dominos; (b) 4 horizontal dominos and 28 vertical dominos? 

12-110. Complete Example 12.116 by writing the full word w and showing that the place- 
ment of every new domino never causes sgn*(/) to become negative. 

12-111. Verify properties (a) through (d) of tensor products of matrices stated after Defi- 
nition 12.117. 

12-112. Prove Lemma 12.118. 

12-113. Let $U_k$ be the matrix defined in Lemma 12.118. Show that $\sqrt{2/(k+1)}\,U_k$ is a
unitary matrix (i.e., $U^{-1} = U^*$, where $U^*$ is the conjugate-transpose of $U$).

12-114. (a) Prove that, for all even m > 0, 


m/2 21m 1_ofy — y2|rri 
I i sceGceiyj=. ee eee a saeco ae 


(b) Deduce that $\prod_{j=1}^{m/2} 2\cos\left(\frac{j\pi}{m+1}\right) = 1$.

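12-115. Show that formula (12.16) simplifies to

$$\begin{cases}
\displaystyle 2^{mn/2} \prod_{j=1}^{m/2} \prod_{k=1}^{n/2} \left[ x^2 \cos^2\left(\frac{j\pi}{m+1}\right) + y^2 \cos^2\left(\frac{k\pi}{n+1}\right) \right] & \text{for $n$ even;} \\[2ex]
\displaystyle 2^{m(n-1)/2}\, x^{m/2} \prod_{j=1}^{m/2} \prod_{k=1}^{(n-1)/2} \left[ x^2 \cos^2\left(\frac{j\pi}{m+1}\right) + y^2 \cos^2\left(\frac{k\pi}{n+1}\right) \right] & \text{for $n$ odd.}
\end{cases}$$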


Notes 


§12.1. Detailed treatments of the theory of lattice paths may be found in [89, 93]. §12.2. 
The Chung-Feller Theorem was originally proved in [22]; the bijective proof given here is 
due to Eu, Fu, and Yeh [30]. §12.3. There is a growing literature on rook theory; some of 
the early papers on this subject are [36, 48, 49, 69]. §12.4. More information about parking 
functions may be found in [34, 38, 76, 119]. §12.6. Expositions of field theory may be found 
in [65] or Chapter 5 of [18]. An encyclopedic reference for finite fields is [78]. §12.7. For more
on Gaussian elimination and RREF matrices, see linear algebra texts such as [62]. §12.8. 
The combinatorial interpretation of the coefficients of the tangent and secant power series 
is due to André [2, 4]. For more information on q-analogues of the tangent and secant series,
see [6, 7, 32]. §12.11. Many facts about matrices and determinants, including the Cauchy–Binet
Formula, appear in the matrix theory text by Lancaster [77]. The text [92] gives a
thorough account of tournaments. The combinatorial derivation of the Vandermonde Deter- 
minant is due to Gessel [45]. §12.12. The probabilistic proof of the Hook-Length Formula 
is due to Greene, Nijenhuis, and Wilf [55]. §12.13. A discussion of Knuth equivalence and 
its connection to the RSK correspondence appears in [72]. Theorem 12.84 on disjoint mono- 
tone subsequences was proved by Curtis Greene [54]; this generalizes Schensted’s original 
result [118] on the size of the longest increasing subsequence of a word. §12.16. Our treat- 
ment of the domino tiling formula closely follows the presentation in Kasteleyn’s original 
paper [70]. 
Appendix: Definitions from Algebra


This appendix reviews some definitions from abstract algebra and linear algebra that are 
used in certain parts of the main text. 




Rings and Fields 


This section defines abstract algebraic structures called rings and fields. (Groups are studied 
in detail in Chapter 7.) Our main purpose for introducing these concepts is to specify the 
most general setting in which certain algebraic identities are true. 


Definition of a Ring. A ring consists of a set $R$ and two binary operations $+$ (addition)
and $\cdot$ (multiplication) with domain $R \times R$, subject to the following axioms.

$\forall x, y \in R,\ x + y \in R$ (closure under addition)
$\forall x, y, z \in R,\ x + (y + z) = (x + y) + z$ (associativity of addition)
$\forall x, y \in R,\ x + y = y + x$ (commutativity of addition)
$\exists 0_R \in R,\ \forall x \in R,\ x + 0_R = x = 0_R + x$ (existence of additive identity)
$\forall x \in R,\ \exists {-x} \in R,\ x + (-x) = 0_R = (-x) + x$ (existence of additive inverses)
$\forall x, y \in R,\ x \cdot y \in R$ (closure under multiplication)
$\forall x, y, z \in R,\ x \cdot (y \cdot z) = (x \cdot y) \cdot z$ (associativity of multiplication)
$\exists 1_R \in R,\ \forall x \in R,\ x \cdot 1_R = x = 1_R \cdot x$ (existence of multiplicative identity)
$\forall x, y, z \in R,\ x \cdot (y + z) = x \cdot y + x \cdot z$ (left distributive law)
$\forall x, y, z \in R,\ (x + y) \cdot z = x \cdot z + y \cdot z$ (right distributive law)


We often write $xy$ instead of $x \cdot y$. $R$ is a commutative ring iff $R$ satisfies the additional
axiom
$\forall x, y \in R,\ xy = yx$ (commutativity of multiplication).


For example, $\mathbb{Z}$ (the set of integers), $\mathbb{Q}$ (the set of rational numbers), $\mathbb{R}$ (the set of real
numbers), and $\mathbb{C}$ (the set of complex numbers) are all commutative rings under the usual
operations of addition and multiplication. For $n > 0$, the set $\mathbb{Z}_n = \{0,1,2,\ldots,n-1\}$ of
integers modulo $n$ is a commutative ring using the operations of addition and multiplication
mod $n$. For $n > 1$, the set $M_n(\mathbb{R})$ of $n \times n$ matrices with real entries is a non-commutative
ring.

Definition of an Integral Domain. An integral domain is a commutative ring $R$ such
that $1_R \ne 0_R$ and $R$ has no zero divisors:

$\forall x, y \in R,\ xy = 0_R \Rightarrow x = 0_R \text{ or } y = 0_R.$

For example, $\mathbb{Z}$ is an integral domain. $\mathbb{Z}_6$ is not an integral domain, since 2 and 3 are nonzero
elements of $\mathbb{Z}_6$ whose product in this ring is 0.

Definition of a Field. A field is a commutative ring $F$ such that $1_F \ne 0_F$ and every
nonzero element of $F$ has a multiplicative inverse:

$\forall x \in F,\ x \ne 0_F \Rightarrow \exists y \in F,\ xy = 1_F = yx.$

For example, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ are fields, but $\mathbb{Z}$ is not a field. One can show that $\mathbb{Z}_n$ is a field iff
$n$ is prime.
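
The two examples above (the zero divisors of the integers mod 6, and the claim that the integers mod n form a field exactly when n is prime) can be checked by brute force for small n. The following Python sketch is only an illustration; the function names are our own and do not appear in the main text, and a finite search is of course not a proof of the general statement.

```python
def zero_divisors(n):
    """Nonzero x in Z_n such that x*y = 0 (mod n) for some nonzero y."""
    return [x for x in range(1, n)
            if any((x * y) % n == 0 for y in range(1, n))]

def units(n):
    """Elements of Z_n that have a multiplicative inverse mod n."""
    return [x for x in range(n)
            if any((x * y) % n == 1 for y in range(n))]

def is_integral_domain(n):
    # Needs 1 != 0 (so n > 1) and no zero divisors.
    return n > 1 and not zero_divisors(n)

def is_field(n):
    # Needs 1 != 0 and every nonzero element to be a unit.
    return n > 1 and units(n) == list(range(1, n))

if __name__ == "__main__":
    print(zero_divisors(6))                           # [2, 3, 4]
    print([n for n in range(2, 20) if is_field(n)])   # the primes below 20
```

For these small moduli the integral-domain test and the field test agree, which is consistent with the fact that a finite integral domain is a field.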

Let $R$ be a ring, and suppose $x_1, x_2, \ldots, x_n \in R$. Because addition is associative, we
can unambiguously write a sum like $x_1 + x_2 + x_3 + \cdots + x_n$ without parentheses. Similarly,
associativity of multiplication implies that we can write the product $x_1 x_2 \cdots x_n$ without
parentheses. Because addition in the ring is commutative, we can permute the summands
in a sum like $x_1 + x_2 + \cdots + x_n$ without changing the answer. More formally, for any bijection
$f : \{1,2,\ldots,n\} \to \{1,2,\ldots,n\}$ and all $x_1, \ldots, x_n \in R$, we have

$$x_{f(1)} + x_{f(2)} + \cdots + x_{f(n)} = x_1 + x_2 + \cdots + x_n.$$

It follows that if $\{x_i : i \in I\}$ is a finite indexed family of ring elements, then the sum of
all these elements (denoted $\sum_{i \in I} x_i$) is well-defined. Similarly, if $A$ is a finite subset of $R$,
then $\sum_{x \in A} x$ is well-defined. On the other hand, the products $\prod_{i \in I} x_i$ and $\prod_{x \in A} x$ are not
well-defined (when $R$ is non-commutative) unless we specify in advance a total ordering on
$I$ and $A$.
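
To see why an unordered product can fail to be well-defined, here is a small Python sketch in the non-commutative ring of 2 x 2 integer matrices. The three matrices and the helper names are our own choices for illustration. Summing the matrices in every possible order gives a single answer, while multiplying them in different orders gives several different answers.

```python
from itertools import permutations

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ms, op):
    """Fold the list ms from left to right with the binary operation op."""
    acc = ms[0]
    for m in ms[1:]:
        acc = op(acc, m)
    return acc

# An arbitrary family of three 2 x 2 integer matrices.
X = [[[1, 1], [0, 1]], [[1, 0], [1, 1]], [[0, 1], [1, 0]]]

# Collect the distinct results over all 3! orderings of the family.
sums = {tuple(map(tuple, combine(list(p), mat_add))) for p in permutations(X)}
prods = {tuple(map(tuple, combine(list(p), mat_mul))) for p in permutations(X)}

if __name__ == "__main__":
    print(len(sums), len(prods))
```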

Let $F$ be a field with additive identity $0_F$ and multiplicative identity $1_F$. It sometimes
happens that there exist positive integers $n$ such that $n.1_F$ (the sum of $n$ copies of $1_F$)
is equal to $0_F$. The characteristic of $F$ is defined to be the least $n > 0$ such that $n.1_F = 0_F$;
if no such $n$ exists, the characteristic of $F$ is zero. For example, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ are
fields of characteristic zero, whereas $\mathbb{Z}_p$ (the integers modulo $p$ for a prime $p$) is a field
of characteristic $p$. It can be shown that the characteristic of every field is either zero or
a prime positive integer. When $F$ has characteristic zero, all the field elements $n.1_F$ (for
$n \in \mathbb{Z}_{>0}$) are nonzero and hence invertible in $F$.
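
The definition of the characteristic translates directly into a search: keep adding copies of the multiplicative identity until the additive identity appears. The Python sketch below is a minimal illustration under our own conventions (the function name, its parameters, and the search bound `max_n` are assumptions, not notation from the text). Because the search is bounded, a return value of 0 only means that no suitable n was found up to the bound; it does not prove that the characteristic is zero.

```python
from fractions import Fraction
import operator

def characteristic(one, zero, add, max_n=10_000):
    """Least n > 0 with n.1 = 0, or 0 if no such n <= max_n exists."""
    total = zero
    for n in range(1, max_n + 1):
        total = add(total, one)   # total is now n.1
        if total == zero:
            return n
    return 0  # nothing found within the search bound

def add_mod(p):
    """Addition in the integers modulo p."""
    return lambda a, b: (a + b) % p

if __name__ == "__main__":
    print(characteristic(1, 0, add_mod(7)))                          # 7
    print(characteristic(Fraction(1), Fraction(0), operator.add))    # 0
```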




Vector Spaces and Algebras 
Next we recall the definitions of the fundamental algebraic structures of linear algebra. 


Definition of a Vector Space. Given a field $F$, a vector space over $F$ consists of a set
$V$, an addition operation $+$ with domain $V \times V$, and a scalar multiplication $\cdot$ with domain
$F \times V$, which must satisfy the following axioms.

$\forall x, y \in V,\ x + y \in V$ (closure under addition)
$\forall x, y, z \in V,\ x + (y + z) = (x + y) + z$ (associativity of addition)
$\forall x, y \in V,\ x + y = y + x$ (commutativity of addition)
$\exists 0_V \in V,\ \forall x \in V,\ x + 0_V = x = 0_V + x$ (existence of additive identity)
$\forall x \in V,\ \exists {-x} \in V,\ x + (-x) = 0_V = (-x) + x$ (existence of additive inverses)
$\forall c \in F,\ \forall v \in V,\ c \cdot v \in V$ (closure under scalar multiplication)
$\forall c \in F,\ \forall v, w \in V,\ c \cdot (v + w) = (c \cdot v) + (c \cdot w)$ (left distributive law)
$\forall c, d \in F,\ \forall v \in V,\ (c + d) \cdot v = (c \cdot v) + (d \cdot v)$ (right distributive law)
$\forall c, d \in F,\ \forall v \in V,\ (cd) \cdot v = c \cdot (d \cdot v)$ (associativity of scalar multiplication)
$\forall v \in V,\ 1 \cdot v = v$ (identity property)

When discussing vector spaces, elements of $V$ are often called vectors, while elements of $F$
are called scalars. For any field $F$, the set $F^n = \{(x_1,\ldots,x_n) : x_i \in F\}$ is a vector space
over $F$ with operations

$$(x_1,\ldots,x_n) + (y_1,\ldots,y_n) = (x_1 + y_1,\ldots,x_n + y_n); \qquad c(x_1,\ldots,x_n) = (cx_1,\ldots,cx_n).$$


Definition of an Algebra. Given a field $F$, an (associative) algebra over $F$ is a set $A$ that
is both a ring and a vector space over $F$ such that the ring multiplication (denoted $v \bullet w$)
and the scalar multiplication (denoted $c \cdot v$) satisfy this axiom:

$\forall c \in F,\ \forall v, w \in A,\ c \cdot (v \bullet w) = (c \cdot v) \bullet w = v \bullet (c \cdot w).$

For example, given any field $F$, let $F[x]$ be the set of all formal polynomials $a_0 + a_1 x + \cdots + a_k x^k$
where all coefficients $a_0, a_1, \ldots, a_k$ come from $F$. This is a commutative algebra over $F$
(called the one-variable polynomial ring with coefficients in $F$) using the standard operations
of polynomial addition, polynomial multiplication, and multiplication of a polynomial by a
scalar. Some relatives of this algebra appear in the main text when we study formal power
series and Laurent series. An example of a non-commutative algebra is the set $M_n(\mathbb{R})$ of
$n \times n$ real-valued matrices, where $n > 1$. More generally, for any field $F$, the set $M_n(F)$
of $n \times n$ matrices with entries in $F$ is an algebra over $F$ using the usual formulas for the
matrix operations.

We can also consider algebras where the field $F$ of scalars is replaced by any commutative
ring $R$. For example, the set of polynomials $R[x_1,\ldots,x_N]$ is a commutative algebra
over $R$, and the set of matrices $M_n(R)$ is a non-commutative algebra over $R$ when $n > 1$
and $|R| > 1$.

Subgroups of groups are defined and studied in §7.5. Here we define the analogous
concepts for other algebraic structures: subrings, ideals, subspaces, and subalgebras. In 
each case, we are looking at subsets that are closed under the relevant algebraic operations. 


Definition of Subrings and Ideals. Let $R$ be a ring. A subring of $R$ is a subset $S$ of $R$
such that $0_R \in S$, $1_R \in S$, and for all $x, y \in S$, $x + y$ and $-x$ and $xy$ are in $S$. An ideal of
$R$ is a subset $I$ of $R$ such that $0_R \in I$ and for all $x, y \in I$ and all $r \in R$, $x + y$ and $-x$ and
$rx$ and $xr$ are in $I$.


Definition of Subspaces. Let $V$ be a vector space over a field $F$. A subspace of $V$ is a
subset $W$ of $V$ such that $0_V \in W$ and for all $x, y \in W$ and all $c \in F$, $x + y$ and $cx$ are in
$W$.


Definition of Subalgebras and Ideals. Let $A$ be an algebra over a field (or commutative
ring) $F$. A subalgebra of $A$ is a subset $B$ of $A$ such that $0_A \in B$, $1_A \in B$, and for all $x, y \in B$
and all $c \in F$, $x + y$ and $x \bullet y$ and $c \cdot x$ are in $B$. An ideal of $A$ is a subset $I$ of $A$ such that
$0_A \in I$ and for all $x, y \in I$, all $z \in A$, and all $c \in F$, $x + y$ and $c \cdot x$ and $z \bullet x$ and $x \bullet z$ are
in $I$.


A graded vector space is a vector space $V$ together with subspaces $V_n$ for each integer
$n \ge 0$, such that $V$ is the direct sum $\bigoplus_{n \ge 0} V_n$. This means that for every $v \in V$, there
exist unique vectors $v_n \in V_n$ such that all but finitely many $v_n$ are zero, and $v = \sum_{n \ge 0} v_n$.
A graded algebra is an algebra $A$ and subspaces $A_n$ such that $A = \bigoplus_{n \ge 0} A_n$ as vector
spaces, and whenever $n, m \ge 0$, $v \in A_n$, and $w \in A_m$, we have $v \bullet w \in A_{n+m}$. For example,
the polynomial algebra $A = \mathbb{R}[x_1,\ldots,x_N]$ is a graded algebra where $A_n$ is the subspace of
all homogeneous polynomials of degree $n$ (including zero). In §9.28, we study the algebra
of symmetric functions $\Lambda = K[p_m : m > 0]$, where $\Lambda_n$ is the subspace spanned by the
power-sum symmetric functions $p_\mu$ as $\mu$ ranges over integer partitions of $n$.

Homomorphisms


This section defines homomorphisms for various kinds of algebraic structures. Group homo- 
morphisms are studied in the main text (see §7.7). 


Definition of a Ring Homomorphism. Given rings $R$ and $S$, a ring homomorphism is
a function $f : R \to S$ such that $f(1_R) = 1_S$ and for all $x, y \in R$, $f(x + y) = f(x) + f(y)$
and $f(x \cdot y) = f(x) \cdot f(y)$.


It automatically follows from these conditions that $f(0_R) = 0_S$, $f(nx) = nf(x)$ for all
$x \in R$ and $n \in \mathbb{Z}$ (see Theorem 7.56), and $f(x^n) = f(x)^n$ for all $x \in R$ and $n \in \mathbb{Z}_{\ge 0}$. Not all
texts require that ring homomorphisms preserve the multiplicative identity.


Definition of a Vector Space Homomorphism. Suppose $V$ and $W$ are vector spaces
over a field $F$. A vector space homomorphism (also called a linear map or linear
transformation) is a function $T : V \to W$ such that for all $v, w \in V$ and all $c \in F$,
$T(v + w) = T(v) + T(w)$ and $T(cv) = cT(v)$.


It automatically follows from these conditions that $T$ preserves linear combinations: for
all $v_1, \ldots, v_n \in V$ and all $c_1, \ldots, c_n \in F$, $T(c_1 v_1 + \cdots + c_n v_n) = c_1 T(v_1) + \cdots + c_n T(v_n)$.


Definition of an Algebra Homomorphism. Suppose $A$ and $B$ are algebras over a field
$F$. An algebra homomorphism is a function $T : A \to B$ that is both a ring homomorphism
and a vector space homomorphism.


Here is an important example of an algebra homomorphism. Let $F$ be any field, and
let $A = F[z]$ be the algebra of polynomials in the formal variable $z$ with coefficients in $F$.
Suppose $B$ is any algebra over $F$ and $c$ is any element of $B$. One may check that there exists
a unique algebra homomorphism $E_c : F[z] \to B$ such that $E_c(z) = c$, given explicitly by

$$E_c(a_0 + a_1 z + \cdots + a_n z^n) = a_0 \cdot 1_B + a_1 c + \cdots + a_n c^n \quad \text{for all } a_0, \ldots, a_n \in F.$$

The map $E_c$ is called the evaluation homomorphism sending $z$ to $c$. More generally, given
any ordered list $c_1, \ldots, c_k$ of elements of $B$, there exists a unique algebra homomorphism
$E : F[z_1, \ldots, z_k] \to B$ such that $E(z_i) = c_i$ for all $i$ between 1 and $k$. These evaluation
homomorphisms often arise in algebraic combinatorics, especially in the theory of symmetric
functions.
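
As a concrete illustration of the one-variable evaluation homomorphism, the Python sketch below takes $B$ to be the algebra of 2 x 2 integer matrices and evaluates a polynomial at a matrix $C$ by Horner's rule. The representation of polynomials as coefficient lists `[a0, a1, ...]`, the helper names, and the sample data are our own assumptions. The sample checks confirm on one example (not in general) that evaluation preserves sums, products, and the identity.

```python
def poly_add(p, q):
    """Add polynomials given as coefficient lists [a0, a1, ...]."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Multiply polynomials given as nonempty coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scalar_matrix(a, n):
    """The n x n matrix a * I, playing the role of a . 1_B."""
    return [[a if i == j else 0 for j in range(n)] for i in range(n)]

def evaluate(p, C):
    """E_C(p) = a0*I + a1*C + ... + an*C^n, computed by Horner's rule."""
    n = len(C)
    result = scalar_matrix(0, n)
    for a in reversed(p):
        result = mat_add(mat_mul(result, C), scalar_matrix(a, n))
    return result

# Sample data: p = 1 + 2z, q = z + 3z^2, and a 2 x 2 matrix C.
p, q = [1, 2], [0, 1, 3]
C = [[0, 1], [1, 1]]

if __name__ == "__main__":
    print(evaluate(poly_mul(p, q), C) == mat_mul(evaluate(p, C), evaluate(q, C)))
```

Horner's rule is used only to avoid computing each power of $C$ separately; summing the terms $a_i C^i$ directly would give the same matrix.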

Let $A$ and $B$ be algebraic structures of the same type (e.g., two rings, two vector spaces,
or two algebras), and let $g : A \to B$ be a homomorphism for that type of algebraic structure.
The kernel of $g$, denoted $\ker(g)$, is the set $\{x \in A : g(x) = 0_B\}$, where $0_B$ is the additive
identity element of $B$. The image of $g$, denoted $\mathrm{img}(g)$, is the set $\{y \in B : \exists x \in A,\ y = g(x)\}$.
One can show that $g$ is one-to-one iff $\ker(g) = \{0_A\}$, and $g$ is onto iff $\mathrm{img}(g) = B$.

If $A$ and $B$ are rings, then $\ker(g)$ is an ideal of $A$ and $\mathrm{img}(g)$ is a subring of $B$. If $A$ and
$B$ are vector spaces, then $\ker(g)$ is a subspace of $A$ and $\mathrm{img}(g)$ is a subspace of $B$. If $A$ and
$B$ are algebras, then $\ker(g)$ is an ideal of $A$ and $\mathrm{img}(g)$ is a subalgebra of $B$. Continuing
the example above, consider the evaluation homomorphism $E : F[z_1, \ldots, z_k] \to B$ such that
$E(z_i) = c_i$ for $i$ between 1 and $k$. The image of $E$ is the subalgebra of $B$ generated by
$c_1, \ldots, c_k$, which is the set of all finite $F$-linear combinations of finite products of $c_1, \ldots, c_k$.
The kernel of $E$ is the set of all formal polynomials $p(z_1, \ldots, z_k)$ such that $p(c_1, \ldots, c_k) = 0$.
Each $p \in \ker(E)$ represents an algebraic relation between the elements $c_1, \ldots, c_k$. If $\ker(E)$
consists of zero alone, we say that $c_1, \ldots, c_k$ is an algebraically independent list. (See §9.14
for more discussion.)

Linear Algebra Concepts


Here we review some concepts from linear algebra that lead to the idea of the dimension of 
a vector space. Throughout, let V be a fixed vector space over a field F’. 


Definition of Spanning Sets. A subset $S$ of the vector space $V$ spans $V$ iff for every
$v \in V$, there exists a finite list of vectors $v_1, \ldots, v_k \in S$ and scalars $c_1, \ldots, c_k \in F$ with
$v = c_1 v_1 + \cdots + c_k v_k$. Any expression of the form $c_1 v_1 + \cdots + c_k v_k$ is called a linear
combination of $v_1, \ldots, v_k$. A linear combination must be a finite sum of vectors.


Definition of Linear Independence. A finite list $(v_1, \ldots, v_k)$ of vectors in $V$ is called
linearly dependent iff there exist scalars $c_1, \ldots, c_k \in F$ such that $c_1 v_1 + \cdots + c_k v_k = 0_V$ and
at least one $c_i$ is not zero. Otherwise, the list $(v_1, \ldots, v_k)$ is called linearly independent. A
set $S \subseteq V$ (possibly infinite) is linearly dependent iff there is a finite list of distinct elements
of $S$ that is linearly dependent; otherwise, $S$ is linearly independent.


Definition of a Basis. A basis of a vector space $V$ is a set $S \subseteq V$ that is linearly
independent and spans $V$.


For example, define $e_i \in F^n$ to be the vector with $1_F$ in position $i$ and $0_F$ in all other
positions. Then $\{e_1, \ldots, e_n\}$ is a basis for $F^n$. Similarly, one may check that the infinite set
$S = \{1, x, x^2, x^3, \ldots, x^n, \ldots\}$ is a basis for the vector space $V = F[x]$ of polynomials in $x$
with coefficients in $F$. $S$ spans $V$ since every polynomial must be a finite linear combination
of powers of $x$. The linear independence of $S$ follows from the definition of equality of formal
polynomials: the only linear combination $c_0 1 + c_1 x + c_2 x^2 + \cdots$ that can equal the zero
polynomial is the one where $c_0 = c_1 = c_2 = \cdots = 0_F$.

We now state without proof some of the fundamental facts about spanning sets, linear 
independence, and bases. 


Linear Algebra Facts. Every vector space $V$ over a field $F$ has a basis (possibly infinite).
Any two bases of $V$ have the same cardinality, which is called the dimension of $V$ and
denoted $\dim(V)$. Given a basis of $V$, every $v \in V$ can be expressed in exactly one way
as a linear combination of the basis elements. Any linearly independent set in $V$ can be
enlarged to a basis of $V$. Any spanning set for $V$ contains a basis of $V$. A set $S \subseteq V$ with
$|S| > \dim(V)$ must be linearly dependent. A set $T \subseteq V$ with $|T| < \dim(V)$ cannot span $V$.


For example, $\dim(F^n) = n$ for all $n \ge 1$, whereas the polynomial ring $F[x]$ is an
infinite-dimensional vector space.
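
For vectors in $\mathbb{Q}^n$, linear independence and spanning can be tested mechanically by computing a rank with Gaussian elimination. The Python sketch below uses exact rational arithmetic; the function names are our own, and the approach (row reduction) is a standard technique rather than something developed in this appendix. The sample checks illustrate the last two facts above in $\mathbb{Q}^3$: four vectors are always dependent, and two vectors can never span.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rk = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)

def spans(vectors, n):
    """True iff the given vectors span Q^n."""
    return rank(vectors) == n

if __name__ == "__main__":
    standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(is_linearly_independent(standard), spans(standard, 3))
    print(is_linearly_independent(standard + [[1, 2, 3]]))
    print(spans([[1, 2, 3], [0, 1, 4]], 3))
```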

degree, 112, 148
degree multiset, 112 
degree of a monomial, 366 
degree of a polynomial, 482 
degree sum formulas, 147 
density of K[z] in K[[z]], 509
derangements, 166, 183, 204 
recursions for, 167 
summation formula, 166 
derivative of k-fold products, 512 
DES (Data Encryption Standard), 39 
descent set, 334, 352 
descent statistic, 334 
determinants, 539, 563 
alternating property, 541 
and algebraic independence, 389 
and elementary row operations, 542 
and Pfaffians, 567 
and transposes, 540 
Cauchy—Binet Theorem, 544 
evaluating, 583 
Laplace expansions, 542 
multilinearity, 541 
of diagonal matrices, 540 
of identity matrices, 540 
of triangular matrices, 540 
product formula, 546 
properties, 576 
Vandermonde, 548 
diagrams of partitions, 75, 549 
dice rolling, 16, 18, 22 
Difference Rule, 15 
differential equations and formal series, 514 
digraphs, 104, 148 
balanced, 144 
connected, 119 
simple, 105 
dihedral groups, 291 
dimension, 593 




discarding cards in poker, 21 
Disjoint Union Rule, 15 
divide-and-conquer algorithms, 237 
divisibility posets, 177, 184 
Dobinski’s Formula, 235 
domain of a function, 25 
dominance ordering, 370 
domino tilings, 570, 578 
enumeration formula, 570 
weight of, 570 
dominos, xxiii 
dual bases, 412, 420 
characterization of, 412 
of $\Lambda_N^k$, 413
duality of $e_\mu$ and $\mathrm{fgt}_\mu$, 414
Dyck Path Rule, 35 
Dyck paths, 34, 347 
q-analogues, 352
and major index, 348 
area of, 347 
bounce statistic, 357 
counting, 517 
for slope s/r, 515 
inversions, 347 
labeled, 526 
major index, 347 
rational slope, 576 
weighted recursion, 348 
Dyck words, 347 


edge collapsing, 135 
EGF Product Rule, 214 
elementary basis of $\Lambda_N$, 419
elementary basis of $\Lambda_N^k$, 387
elementary Knuth relations, 554 
elementary symmetric polynomials, 364 
algebraic independence, 389 
as skew Schur polynomials, 415 
definition, 365 
generating function for, 392 
monomial expansion, 387 
power-sum expansion, 397 
recursion for, 390, 393 
recursion involving $p_k$, 394
Schur expansion, 387 
summary of facts about, 419 
empty word, 7 
enumerative combinatorics, xix 
equality of formal power series, 481 
equality of words, 7 
equivalence classes, 82 
equivalence relations, 82
Erdős–Szekeres theorem, 558, 584
Eulerian Tour Rule, 145
Eulerian tours, 143, 148, 149
evaluation homomorphisms, 388, 511
evaluation map Rg, 492
even permutations, 295
events, 16
independent, 23
expansions of symmetric polynomials, 420
exponent notation, 280, 317
Exponential Formula, 496
exponential generating functions (EGFs), 193, 213
exponential of formal power series, 494
Extended Binomial Theorem, 197
extended Ferrers board, 88


face cards, 162
factorials, 8
q-analogue, 332
falling, 90
rising, 90
falling factorials, 90
falling rook polynomials, 522
fast binary multiplication, 237
Fermat’s Little Theorem, 101, 310
Ferrers boards, 83, 521
Fibonacci numbers, xxiii, 100, 232
q-analogue, 356
closed formula, 91
recursion for, 62
fields, 590
construction of, 581
cyclic multiplicative subgroups, 576
finite, 532
multiplicative subgroups, 530
file rooks, 85
fillings, 359
content of, 360
finite additivity, 17
finite fields, 532, 576
first subroutine, 239
first-order shadow diagrams, 402
fixed point sets, 318
flags of subspaces, 582
flawed paths, 518
flush hands, 11
Foata’s bijection, 344
forests, 122, 148, 149
ordered, 502
forgotten basis of $\Lambda_N$, 399
formal nth roots, 513
Formal Chain Rule, 493
formal composition, 492
group axioms, 504
formal derivatives, 493
formal exponentials, 494
Formal Geometric Series, 484
formal integrals, 512
formal Laurent series, 510
formal logarithms, 494
formal ordinary differential equations, 514
formal polynomials, 482
formal power series, 193, 481
continuity, 482
infinite products, 483
infinite sums, 483
multiplicative inverses, 485
operations on, 199
formal powers, 512
Formal Quadratic Formula, 513
Formal Quotient Rule, 512
four-of-a-kind poker hands, 20
fraternity names, 6, 266
frontier of a partition, 77
full binary trees, 210
full house hands, 11
Function Rule, 25
functional digraphs, 113, 147, 148
functions, 25
acyclic, 152
composition of, 28
one-line form, 281
one-to-one, 25
onto, 26
two-line form, 280
fundamental quasisymmetric basis of $Q_N^k$, 561
fundamental quasisymmetric expansion of $s_\lambda$, 562
fundamental quasisymmetric polynomials, 560
multiplication rule, 585
Fundamental Theorem of Algebra, 487
Fundamental Theorem of Symmetric Polynomials, 389


gaps, 429
Garsia–Milne Involution Principle, 189, 580
General Union Rule, 159
Generalized Distributive Law, 96
generating functions, 191, 351
and recursions, 199
and summations, 203
exponential, 213
for $e_k$, 419
for $h_k$, 419
for binary trees, 209
for connected graphs, 499
for integer partitions, 219
for inversions, 352
for major index, 352
for partitions, 229
for Stirling numbers, 215-217, 229
for trees, 229
for weighted sets, 327
multivariable, 356
product of, 213
generators of cyclic groups, 529
geometric series, 53
G-isomorphisms, 307
Glaisher’s bijection, 223
Glaisher’s Partition Identity, 224
G-maps, 307
graded algebras, 366, 591
graded vector spaces, 591
graph automorphisms, 317
graph isomorphisms, 105, 148
graph of a permutation, 402
graphs, 103, 148
k-regular, 112
acyclic, 122, 148
and perfect matchings, 565
arboricity, 154
automorphisms of, 290
bipartite, 129, 148
chromatic functions, 134
chromatic numbers, 134
colorings of, 133
connected, 119, 148
incidence matrix of, 152
isomorphic, 105
proper coloring, 148
simple, 104
Greek alphabet, 266
group actions, 296
and permutation representations, 300
axioms, 318
by left multiplication, 296
conjugation, 296
examples, 318
main results, 318


group automorphisms, 317 
group axioms, 277 
for formal composition, 504 
group homomorphisms, 317 
properties, 318 
theorems, 323 
group isomorphisms, 317 
groups, 277 
S4, 283 
Sy, 278 
Abelian, 277 
acting on a set, 296 
automorphisms of, 294 
axioms, 317 
basic properties, 316 
center of, 307, 318 
commutative, 277, 317 
conjugacy classes of, 302 
Correspondence Theorem, 323 
cosets, 303 
cyclic, 289, 317 
Diamond Isomorphism Theorem, 323


dihedral, 291 
Double-Quotient Isomorphism 
Theorem, 323 
examples, 316 
exponent notation, 280 
Fundamental Homomorphism 
Theorem, 323 
homomorphisms, 293 
isomorphisms, 293 
left cosets, 304 
products of, 320 
rules for inverses, 279 
symmetric, 278 
Universal Mapping Property, 323 
G-sets, 296 
G-stable subsets, 301, 318 


Hall scalar product, 412 
Hall’s Matching Theorem, 133 
hands, see poker hands 
hash functions, 525 
hash tables, 525 
Helly property, 154 
higher-order shadow diagrams, 404 
homogeneous polynomials, 366 
homomorphisms, 293, 592 
image of, 295 
kernel of, 295 
hook length, xx, 549 
hook walk, 550

Hook-Length Formula, xx, 549, 577 
hooks, xx, 549 

horizontal strips, 382 

hyperbolic trigonometric functions, 196 
hypercubes, 322 


ideals, 591 
in K[z] and K[[z]], 510 


identity element for formal composition, 492 


identity maps, 28 
iff (if and only if), 7 
image of a function, 25 
image of a homomorphism, 295, 317 
images, 592 
incidence matrices, 152 
Inclusion-Exclusion Formulas, xxii, 159, 
161, 182 
increasing subsequences, 555 
indegree, 112, 148 
independent events, 23 
index of a subgroup, 305 
index of nilpotence, 109 
induced subgraphs, 148 
Infinite Product Rule for Weighted Sets, 
484 
infinite products of formal series, 483 
Infinite Sum Rule for Weighted Sets, 484 
infinite sums of formal series, 483 
Injection Rule, 26 
injections, 25, 46 
inner automorphisms, 294, 317 
insertion tableaux, 400 
integer division, 241 
Integer Division Theorem, 271 
Integer Equation Rule, 31 
integer equations, 163 
integer partitions, 74, 359, 549 
k-quotients of, 438 
conjugate of, 76 
cores, 433 
diagrams, 75 
dominance ordering, 370 
Euler’s recursion for, 78 
frontier of, 77 
generating functions, 219, 220, 229 
in a box, 77 
lexicographic ordering, 369 
ranking, 253 
recursion for, 76 
self-conjugate, 101 


with distinct parts, 221 
with odd parts, 220 
integers modulo n, 278 
integral domains, 589 
interleaving bijection, 434 
inverse functions, 28 
inverse Kostka matrix, 374, 466, 476 
combinatorial interpretation, 467 
inversion tables, 333 
inversions, 283, 330, 352 
and q-factorials, 332 
and sorting, 317 
for tournaments, 546 
Involution Principle, 189 
Involution Theorem, 168 
involutions, 168, 183, 236 
fixed point set, 168 
sign-reversing, 168 
irreducible polynomials, 531, 576 
irreflexive relations, 82 
isolated vertices, 113, 148 
isometries, 413 
isomorphisms, 293 
iterated shadow diagrams, 404 


Jacobi Triple Product Identity, 431, 476 
Jacobi-Trudi Formulas, 464 

Jacobian, 389 

justified abaci, 429 


König–Egerváry Theorem, 132
k-content, 440 
k-cores, 433, 476 
uniqueness of, 437 
k-cycles, 282, 317 
kernels, 295, 317, 592 
kings, 584 
kite graph, 150 
Knuth equivalence, 554, 577 
Kostka matrix, 373, 466 
Kostka numbers, 367, 418 
symmetry property, 367 
triangularity property, 373 
k-permutations, 8 
k-quotients, 438, 476 
k-regular graphs, 112 
k-runner abaci, 434 


labeled abaci, 445 
labeled balls in labeled boxes, 32 
labeled Dyck paths, 526 
labeled paths, 525
Lagrange Inversion Formula, 505 
Lagrange’s Theorem, 305 
Lah numbers, 101 
Laplacian matrices, 141 
last subroutine, 239 
Lattice Path Rule, 34 
lattice paths, 33 

q-analogues, 352

area statistics, 337 

in a rectangle, 66 

in a triangle, 66 

labeled, 525 
laws of algebra, 48 
Laws of Exponents, 280 
laws of exponents, 48 
leaves, 113, 148, 210 
left actions, 299 
left cosets, 304 
left inverses, 46 
left multiplication maps, 299 
left subtree of binary tree, 71 
left-to-right minimum, 357 
length of a partition, 75 
length of a word, 7 
lexicographic ordering, 248, 369 
license plates, 3 
limits of formal power series, 482 
linear combinations, 593 
linear independence, 593 
linear maps, 592 
linear orderings, 7 
lists of terms, 502 
Littlewood—Richardson coefficients, 469, 474 
Littlewood—Richardson Rule, 469, 476 
Log (complex logarithm), 196 
logarithm of formal power series, 494 
loops, 103 
lotteries, 16, 19 
Lucas numbers, 232 
Lucas’s Congruence, 310 


major index, 334, 352 
and q-factorials, 334 
for set partitions, 350 
of Dyck paths, 348 
major index table, 335 
matchings, 130, 148 
matrices, 539 
adjoint of, 543 
inverse of, 544 
multiplication, 147
nilpotent, 109
powers of, 108
product of, 107
strictly upper-triangular, 109
tensor products, 573
transpose, 540
matrix rings, 539 
Matrix-Tree Theorem, 141, 149 
merge sort, 237 
Möbius functions, 178, 183
and chains, 179
for Boolean posets, 178, 182
for divisibility posets, 179, 182
for posets, 184
for product poset, 181
for subspace lattice, 582
for totally ordered poset, 178
number-theoretic, 174
recursions, 186
Möbius Inversion Formula, 183
for posets, 180, 184
in number theory, 175
monomial antisymmetric polynomials, 442 
monomial basis of Λ_N^k, 366
monomial basis of Λ_N, 419
monomial basis of R[x], 90
monomial basis of Q&,, 560 
monomial expansion of FQ; 5, 561 
monomial expansion of Schur polys., 369 
monomial quasisymmetric polynomials, 559 
monomial symmetric polynomials, 364 
monomials, 45 
multinomial coefficients, 14
recursion for, 65
Multinomial Theorem, 56
non-commutative, 57
multiplication rule for FQ,, 5, 585 
multiplication tables for groups, 278 
multiplicative inverses, 485 
and symmetric functions, 486 
coefficients of, 486 
multiplicity of roots, 487 
Multiset Rule, 30 
multisets, 30, 208
recursion for, 64


necklaces, 312
Negative Binomial Theorem, 197
q-analogue, 355
neighbors of a vertex, 133
new box, 375
next subroutine, 239
nilpotent matrices, 147
non-attacking rook placements, 84
Non-Commutative Multinomial Theorem, 57
normal distributions, 520
normalizer subgroups, 307, 318
notation for counting, 38


one-line form, 281 
one-pair poker hands, 21 
one-to-one correspondences, 26 
one-to-one functions, 25 
onto functions, 26 
operations on power series, 198 
orbit decomposition of a G-set, 303 
Orbit Size Theorem, 307 
orbits, 301, 318 
counting with weights, 315 
number of, 312 
size of, 307 
order 
formula for, 529 
of a group element, 295, 317 
properties, 323 
order of a formal power series, 482 
ordered forests, 502 
ordered set partitions, 101, 495 
ordered trees, 210, 500 
orderings on partitions, 418 
ordinary generating functions (OGFs), 193 
orthonormal basis, 412 
outdegree, 112, 148 


Pólya’s Formula, 314, 316
palindromes, 14 
parentheses, 71 
parking functions, 523, 576 
circular, 524 
counting, 524 
partial fractions, 487 
partial orders, 177, 184 
and strict orders, 177 
Partial Permutation Rule, 8 
partial permutations, 8 
ranking, 265 
partially ordered sets, 177 
partite sets, 129 
partition diagrams, 359 
partition division, 439 
partition identities, 224
Partition Recursion, 226 
involution proof, 236 
partitions, see set partitions or integer partitions
parts of a partition, 74 
Pascal’s Identity, 52, 58, 63 
Pascal’s Triangle, 63 
Path Rule for DAGs, 111 
paths, 105, 148 
paw graph, 150 
Pentagonal Number Theorem, 225, 229, 432 
P-equivalence, 554 
perfect matchings, 565 
and Pfaffians, 567 
on complete graphs, 566 
permanents, 583 
permutation cipher, 39 
permutation representations, 299, 318 
and group actions, 300 
Permutation Rule, 8 
permutations, 7 
T-avoiding, 73 
231-avoiding, 72 
Cartesian graph of, 402 
cycle structure, 147 
disjoint, 321 
even, 295 
notation for, 316 
ranking, 249 
successor algorithm, 259 
Petersen graph, 157, 322 
Pfaffians, xxiv, 564, 578 
and determinants, 567 
and perfect matchings, 567 
row expansion, 564 
Pieri Rules, 379, 382, 419 
and skew shapes, 416 
for e_k, 451
for h_k, 453
for p_k, 449, 454
placements of rooks, 84 
poker, 20 
poker hands, 4 
flush, 11 
four-of-a-kind, 20 
full house, 11 
one-pair, 21 
straight, 4 
three-of-a-kind, 43 
two-pair, 41 
Pollard-rho algorithm, 152
polynomial identities, 54 
polynomials 
coefficients and roots, 392 
evaluation homomorphisms, 388, 511 
irreducible, 531 
root of, 487 
posets, 177, 184 
and chains, 184 
and Möbius functions, 184
Boolean, 177 
divisibility, 177 
isomorphism of, 181 
of DAGs, 186 
products of, 181, 184 
totally ordered, 177 
power series 
of exponential function, 196 
of trigonometric functions, 196 
operations on, 198 
radius of convergence, 192 
Power Set Rule, 10 
power sets, 10 
power-sum basis of Λ_N, 390
power-sum symmetric polynomials, 316, 363 
algebraic independence, 389 
and e_n, 397
and h_n, 396
definition, 365
recursion involving e_k, 394
Schur expansion, 456 
summary of facts about, 419 
powerball, 19 
powers of formal power series, 512 
Prüfer codes, 158
predecessor algorithms, 272 
probability, 16, 17 
classical, 17 
conditional, 21 
equally likely outcomes, 17 
non-equally likely outcomes, 18 
probability definitions, 39 
probability measures, 17, 18 
product groups, 320 
product map Pn,m, 241 
product map p_{n_1,...,n_k}, 244
product of formal power series, 481 
product of maps f × g, 243
product of several generating functions, 213 
product posets, 181, 184 
Product Rule, 3 
Bijective, 48, 243, 246
for EGFs, 214 
for successor algorithms, 261 
for Weighted Sets, 351 
proof, 36 
Product Rule for Derivatives, 512 
Product Rule for Weighted Sets, 205, 329 
infinite version, 484 
proper colorings, 133 
proving bijectivity, 29 
pruning, 124 


Quadratic Formula for formal series, 513 
quasisymmetric polynomials, 559, 577 
quaternions, 581
quotient, 241
quotient groups, 323
Quotient Rule for formal series, 512


radius of convergence, 192 
formulas for, 194 
raising operators, 371 
ranking, 264 
anagrams, 252, 265 
integer partitions, 253 
partial permutations, 265 
permutations, 249 
set partitions, 254 
subsets, 250, 251, 265 
trees, 256 
words, 246 
ranking algorithms, 239 
Ratio Test, 194 
rational-slope Dyck paths, 576 
counting, 515 
reading words, 553 
rearrangements, 7 
recording tableaux, 400 
recursions, 51 
and generating functions, 199 
for q-analogues, 352
for q-Catalan numbers, 348
for q-multinomial coefficients, 342
for q-Stirling numbers, 350 
for 231-avoiding permutations, 72 
for Bell numbers, 80 
for binary trees, 71 
for Catalan numbers, 70 
for coefficients of compositional inverse, 504
for coefficients of multiplicative inverses, 485
for divide-and-conquer algorithms, 237
for Fibonacci numbers, 62, 91
for integer partitions, 76, 226
for Möbius functions, 186
for multinomial coefficients, 65
for multisets, 64
for partition numbers p(n), 78
for paths in a rectangle, 66
for paths in a triangle, 66
for Stirling numbers, 79, 218
for subsets, 62, 63
for symmetric polynomials, 420
relating e_i and h_j, 390
with constant coefficients, 90, 490
recursive formulas, 61
reduced row-echelon form (RREF), 533
Reflection Principle, 36
reflexive relations, 82
regular graph, 112
relations
definitions, 82
matrix of, 177
partial order, 177
strict order, 177
remainder, 241
removal of border ribbon, 434
restricted words, 214
reversal map, 29
reverse bumping path, 378
reverse bumping sequence, 378
reverse tableau insertion, 377, 378
ribbons, 433
sign, 448
spin, 448
right composition map Re, 492
right cosets, 303
right group actions, 298
right inverses, 46
right multiplication maps, 320
right subtree of binary tree, 71
rim-hook tableaux, 455, 476
special, 467
ring homomorphisms, 592
rings, 589
rising factorials, 90
R-linear maps, 540
Rogers-Ramanujan Identities, 224
rook placements, 5, 6, 84, 356
non-attacking, 521
rook polynomials, 521
rook-equivalence, xxiii, 521, 576
rooks, xxi, 83
non-attacking placement of, 84
relation to Stirling numbers, 84
Root Test, 194
rooted spanning trees, 148
rooted tree, 148
Rooted Tree Rule, 117
roots of formal power series, 513
roots of polynomials, 392, 487
row set of a hook walk, 550
row word of A, 408
RREF basis, 533
RREF matrices, 533
RSK correspondence, 400, 420
and descent statistics, 406
and inversion, 405
for biwords, 408
for matrices, 409
for permutations, 400
for words, 405


sample spaces, 16
Schur basis, 373
Schur expansions
for words, 407
of e_α, 387
of h_α, 385
Schur symmetric polynomials, 361
and e_α, 387
and h_α, 385
and antisymmetric polys., 453, 458
fundamental quasisymmetric expansion, 562
monomial expansion of, 369
power-sum expansion of, 457
summary of facts about, 419
secant power series, 536
semistandard tableaux, 359
series expansion of E_N(t), 392
set partitions, 82, 495
q-analogues, 350
blocks of, 79
definition, 79
ordered, 101
ranking, 254
shadow diagrams, 404
shadow lines, 402
and RSK, 403
shifts, 559
shuffles, 585
sign, 283 
and sorting, 317 
for tournaments, 546 
main properties, 286 
sign of a ribbon, 448 
sign of labeled abaci, 445 
simple digraphs, 105, 148 
simple graphs, 104, 148 
sinks, 112 
skew fillings, 415 
skew Kostka numbers, 416 
skew Pieri Rules, 416 
skew Schur polynomials, 415, 461 
and ω, 462
expansion using e_k, 464
expansion using h_k, 464
monomial expansion of, 416 
power-sum expansion of, 461 
Schur expansion of, 469 
symmetry of, 416 
skew shapes, 414 
skew tableaux, 415 
skew-symmetric matrices, 563 
solving recursions, 90 
sorting by comparisons, 47 
sources, 112 
spanning sets, 593 
spanning trees, 138, 148 
recursion for, 139, 141, 149 
rooted, 140 
special rim-hook tableaux, 467 
spin of a ribbon, 448 
squares attacked by rooks, 83 
stabilizers, 306, 318 
standard tableaux, xx, 360, 549 
descent count, 406 
descent set, 406 
major index, 406 
random generation, 550 
statistics, 327 
on Dyck paths, 347 
on standard tableaux, 406 
Stirling numbers 
q-analogues, 350, 357
and Exponential Formula, 498
and involutions, 170
and transition matrices, 90
first kind, 86, 116, 134, 147, 216, 357
generating functions, 216, 217, 229
polynomial identities, 87-89
recursions for, 79, 86, 218
relation to rooks, 84
second kind, 79, 183, 215, 217, 498
summation formula, 89, 164
Stirling permutations, 350 
straight poker hands, 4 
straight shapes, 414 
strict orders vs. partial orders, 177 
strict partial orders, 151, 177, 184 
strictly decreasing words, 28 
strictly increasing words, 28
strong components of a digraph, 121
subalgebras, 591 
subgraphs, 138, 148 
induced, 138 
subgroups, 287, 317 
cyclic, 317 
generated by a set, 321 
index of, 318 
normal, 288 
of Z, 288 
of cyclic groups, 529 
of fields, 530 
products of, 322 
subposets, 186 
subrings, 591 
Subset Rule, 11 
subsets, 10, 28 
ranking, 250, 251, 265 
recursion for, 62, 63 
successor algorithm, 260 
unranking, 251 
subspaces, 591 
counting, 532, 576 
successor algorithms, 239 
Bijection Rule, 257 
for ⟦n⟧, 257
for anagrams, 258 
for permutations, 259 
for subsets, 260 
successor maps, 264 
Successor Product Rule, 261, 265 
for Two Sets, 260 
Successor Sum Rule, 257, 265 
Sudoku Theorem, 320 
sum of binomial coefficients, 51 
sum of formal power series, 481 
sum of geometric series, 53
sum of squared binomial coefficients, 53, 58
Sum Rule, 5 
bijective, 241 
for ranking, 241
for successor algorithms, 257 
for Weighted Sets, 351 
proof, 36 
Sum Rule for Weighted Sets, 205, 329 
infinite version, 210, 484 
summary of counting rules, 38 
Summation Formula for Derangements, 204 
summations and generating functions, 203 
sums of powers of integers, 61, 232 
support, 321 
Surjection Rule, 81, 163 
surjections, 26, 46, 81, 183, 215 
Sylow’s Theorem, 311 
Sylvester’s Bijection, 222 
symmetric functions, 416 
and multiplicative inverses, 486 
symmetric groups, 278 
symmetric polynomials, 363 
and algebraic independence, 419 
and antisymmetric polynomials, 444 
bases for, 419 
complete, 364, 365 
elementary, 364, 365, 387 
expansions in other bases, 420, 475 
forgotten basis, 399 
fundamental theorem, 389 
monomial, 364 
power-sums, 363, 365, 390 
recursions for, 420 
scalar product of, 412 
Schur basis, 373 
symmetric relations, 82 
symmetry of binomial coefficients, 52 


tableau insertion, 374, 419 
tableaux, 359
and abaci, 476
reading word, 458, 553
tangent power series, 536 
tensor product of matrices, 573 
terms, 501 
ternary trees, 235 
Texas Hold ’em, 47 
Theorem on Coefficients of Multiplicative Inverses, 486
Theorem on Degree and Order, 482
Theorem on Formal Composition, 492
Theorem on Formal Continuity, 482
Theorem on Formal Derivatives, 493
Theorem on Formal Exponentials, 494
Theorem on Formal Logarithms, 494
Theorem on Invertible Polynomials and Formal Series, 485
Theorem on Power Series and Analytic Functions, 195
Theorem on Radius of Convergence, 194 
Theorem on Splitting a Denominator, 487 
three-of-a-kind poker hands, 43 
tilings, xxiii 
tossing an unfair coin, 24 
totally ordered sets, 177 
tournaments, 546, 577 
criterion to be transitive, 547 
generating function for, 546 
kings in, 584 
transitive, 547 
weights and signs, 546 
transition matrices, 90 
and Stirling numbers, 90 
transitive relations, 82 
transpose of a matrix, 540 
transpositions, 283, 317 
and sorting, 284 
basic, 283 
Tree Rule, 125 
and parking functions, 527 
trees, 123, 148, 149 
bijections, 211 
enumeration of, 149 
ordered, 210, 500 
rooted, 114 
ternary, 235 
trigonometric functions, 196 
truncated Laplacian matrix, 141 
truth function χ, 169, 183
two-line form, 280 
two-pair poker hands, 41 
two-sided inverses, 28 
two-to-one functions, 44 
types of braces, 30 


uniform distribution, 520 
Union Rule 
for three sets, 15 
for two sets, 15 
Union-Avoiding Rule, 161 
unique readability of lists of terms, 502 
unique readability of terms, 501 
uniqueness 
of evaluation homomorphisms, 511 
of identity in a group, 279 
of inverses in a group, 279
Universal Mapping Property, 323 
unranking, 264
subsets, 251
unranking algorithms, 239 
up-down permutations, 536 


Vandermonde determinant, 390, 442, 548, 577
vector spaces, 590 
basis, 593 
dimension, 593 
homomorphisms, 592 
verifying polynomial identities, 54 
vertex covers, 131, 148 
vertical strips, 382 


Walk Rule, 108 
walks, 105, 147, 148
closed, 105
weakly decreasing words, 28 
weakly increasing words, 28 
weight of domino tiling, 570 
weight of labeled abaci, 445 
weight-additivity condition, 205, 329, 484 
weight-preserving bijections, 205, 329, 351 
Weight-Shifting Rule, 330 
weighted sets, 191, 327, 351 
Wilson’s Theorem, 325, 581 
Word Rule, 7 
words, 7, 214, 501
decreasing, 28
empty, 7
increasing, 28
monotone subsequences, 577
ranking, 246
rearrangements of, 12
weight of, 501
weighted by major index, 407
WZ-method, 102 


Young tableaux, 360 


